

Commission Members
Foreword
Preface: Who Is the Intended Audience for This Report?
Executive Summary
What Is Cyberinfrastructure?
What Are the Humanities and Social Sciences?
What Is Digital Scholarship?
What Are the Distinctive Needs and Contributions of the Humanities and Social Sciences in Cyberinfrastructure?
A Grand Challenge for the Humanities and Social Sciences
Decades of Accelerating Change
Cultural Infrastructure and the Public
Seeing in New Ways
Working in New Ways
Ephemerality
The Nature of Humanities and Social Science Data
Copyright
The Conservative Culture of Scholarship
Culture, Value, and Communication
Resources
Framework
Necessary Characteristics
Recommendations
Conclusion
Appendix I: The Charge to the Commission
Appendix II: Public Information-Gathering Sessions


Commission Members 




Paul N. Courant — The Arthur F. Thurnau Professor; Professor of Public 
Policy; Professor of Economics; Professor of Information; former Provost, 
University of Michigan 


Sarah E. Fraser — Associate Professor and Chair, Department of Art History, 
Northwestern University 


Michael F. Goodchild — Director, Center for Spatially Integrated Social 
Science, and Professor of Geography, University of California, Santa 
Barbara 


Margaret Hedstrom — Associate Professor, School of Information, 
University of Michigan 


Charles Henry — Vice Provost and University Librarian, Rice University 


Peter B. Kaufman — President, Intelligent Television 


Jerome McGann — The John Stewart Bryan University Professor, University 
of Virginia 


Roy Rosenzweig — The Mark and Barbara Fried Professor of History and 
New Media, and Director, Center for History and New Media, George 
Mason University 


John Unsworth (Chair) — Dean and Professor, Graduate School of Library 
and Information Science, University of Illinois, Urbana-Champaign 


Bruce Zuckerman — Professor of Religion, School of Religion; Director, 
West Semitic Research and InscriptiFact Projects; Director, Archaeological 
Research Collection, University of Southern California 


Editor 


Marlo Welshons — Assistant Dean for Publications and Communications, 
Graduate School of Library and Information Science, University of Illinois, 
Urbana-Champaign 


Domestic Advisors to the Commission 


Dan Atkins — Professor, School of Information, and Director, Alliance for 
Community Technology, University of Michigan 


Christine L. Borgman — Professor and Presidential Chair, Department of 
Information Studies, University of California, Los Angeles 


James Herbert — Senior NSF/NEH Advisor, National Science Foundation 


Clifford Lynch — Director, Coalition for Networked Information 


Deanna Marcum — Associate Librarian for Library Services, Library of 
Congress 


Abby Smith — Independent Consultant and former Director of Programs, 
Council on Library and Information Resources 


Steven C. Wheatley — Vice President, American Council of Learned 
Societies 


International Advisors to the Commission 


Sigrun Eckelmann — Programmdirektorin, Organisationseinheit, Bereich 
Wissenschaftliche Informationssysteme, Deutsche Forschungsgemeinschaft 


Muriel Foulonneau — French Ministry of Culture; Minerva Project; 
European Commission, Visiting Assistant Professor, University of Illinois, 
Urbana-Champaign 


Stefan Gradmann — Stellvertretender Direktor, Regionales Rechenzentrum, 
Universität Hamburg, Hamburg, Germany 


Bjørn Henrichsen — Administrative Director and Executive Director, Norsk 
samfunnsvitenskapelig datatjeneste AS (NSD)/Norwegian Social Science 
Data Services Ltd., Bergen, Norway 


Michael Jubb — Director of Policy and Programmes, Arts and Humanities 
Research Board, Bristol, United Kingdom 


Jaap Kloosterman — International Institute of Social History, Amsterdam, 
The Netherlands 


David Moorman — Senior Policy Advisor / Conseiller principal des 
politiques, Social Sciences and Humanities Research Council/Conseil de 
recherches en sciences humaines du Canada, Ottawa, Canada 


David Robey — Programme Director, ICT in Arts and Humanities Research, 
Arts and Humanities Research Board, School of Modern Languages, 
University of Reading, Reading, England 


Harold Short — Director, Centre for Computing in the Humanities, King's 
College London, London, United Kingdom 


Colin Steele — Emeritus Fellow; University Librarian (1980-2002); 
Director, Scholarly Information Strategies (2002-2003), The Australian 
National University, Canberra, Australia 


Public Information-Gathering Meetings 


April 27, 2004, Washington, DC 

May 22, 2004, Evanston, IL 

June 19, 2004, New York, NY 


August 21, 2004, Berkeley, CA 


September 18, 2004, Los Angeles, CA 


October 26, 2004, Baltimore, MD 


Testimony and Background Materials 


Foreword 




I am pleased to commend Our Cultural Commonwealth to what I hope will 
be the many readers who will find in the report a vision of the future and a 
guide to realizing that future. 


One role of the American Council of Learned Societies is to convene 
scholars and institutional leaders to consider challenges important to the 
advancement of humanistic studies in all fields. The effective and efficient 
implementation of digital technologies is precisely such a challenge. It is 
increasingly evident that new intellectual strategies are emerging in 
response to the power of digital technologies to support the creation of 
humanistic knowledge. Innovative forms of writing and image creation 
proliferate in arts and letters, with many new works accessible and 
understood only through digital media. Scholars are increasingly dependent 
on sophisticated systems for the creation, curation, and preservation of 
information. In 2004, therefore, ACLS asked John Unsworth, Dean of the 
Graduate School of Library and Information Science, University of Illinois, 
Urbana-Champaign, to chair a Commission on Cyberinfrastructure in the 
Humanities and Social Sciences. Dean Unsworth selected the other 
members of the Commission and its advisers, who worked with dedication 
and determination. The analysis and recommendations of this report are 
theirs, but the responsibility for grappling with the issues they present lies 
with the wider community of scholarship and education. 


The convergence of advances in digital technology and humanistic 
scholarship is not new. Indeed, this publication is at least the sixth major 
report focused on technology and scholarship in the humanities and 
interpretive social sciences issued by our Council. [footnote] In 1965, ACLS 
began a program of providing fellowships to scholars whose projects 
experimented with “computer aided research in the humanities.” A forty- 
year-old statement of that program’s purpose remains convincing: “Of 
course computers should be used by scholars in the humanities, just as 
microscopes should be used by scientists. . . [t]he facts and patterns that 
they—and often they alone—can reveal should be viewed not as the 
definitive answers to the questions that humanists have been asking, but 
rather as the occasion for a whole range of new and more penetrating and 
more exciting questions.” [footnote] For the past forty years increasing 
numbers of individual scholars have validated and re-validated that 
assertion. We have now arrived at the point, however, where we cannot rely 
on individual enterprise alone. This report is therefore primarily concerned 
not with the technological innovations that now suffuse academia, but 
rather with institutional innovations that will allow digital scholarship to be 
cumulative, collaborative, and synergistic. 

Herbert C. Morton and Anne J. Price, The ACLS Survey of Scholars: The 
Final Report of Views on Publications, Computers, and Libraries 
(Washington: University Press of America, 1989). Herbert C. Morton et al., 
Writings on Scholarly Communication: An Annotated Bibliography of 
Books and Articles on Publishing, Libraries, Scholarly Research, and 
Related Issues (University Press of America, 1988). Scholarly 
Communication: The Report of the National Enquiry (Johns Hopkins 
University Press, 1979). “Computerized Research in the Humanities,” 
ACLS Newsletter, Special Supplement, June 1966. Pamela Pavliscak, 
Seamus Ross, and Charles Henry, “Information Technology in Humanities 
Scholarship: Achievements, Prospects, and Challenges—The United States 
Focus,” ACLS Occasional Paper #37, 1997. 

Charles Blitzer, “This Wonderful Machine: Some Thoughts on the 
Computer and the Humanities,” ACLS Newsletter, Vol. XVII, April 1966, 
No. 4. 


Those institutional innovations are the “cyberinfrastructure” advocated by 
the following pages. We are grateful to the National Science Foundation 
and to Dan Atkins, who chaired the NSF Advisory Panel on 
Cyberinfrastructure that issued in 2003 a report on the subject, for giving 
the term currency and meaning. (Dr. Atkins also served as an adviser to the 
ACLS Commission.) In addition to the “Atkins report,” the NSF 
commissioned a report on the cyberinfrastructure needs of the more 
quantitative social sciences. [footnote] With the publication of Our Cultural 
Commonwealth, which concerns the humanities and interpretive social 
sciences, we now have all of the fields of the arts and sciences in common 
cause. 


Francine Berman and Henry Brady, “Final Report: NSF SBE-CISE 
Workshop on Cyberinfrastructure and the Social Sciences,” 
www.sdsc.edu/sbe/. 


ACLS’s earlier reports focused within the academy and concerned the 
potential of new information technologies to empower research on 
traditional objects of study. That orientation is the starting point for this 
effort, and the evidence there is compelling. But the widespread social 
adoption of computing is transforming the very subjects of humanistic 
inquiry. In 2006 most expressions of human creativity in the United States 
—writing, imaging, music—will be “born digital.” The intensification of 
computing as a cultural force makes the development of a robust 
cyberinfrastructure an imperative for scholarship in the humanities and 
social sciences. Political scientists must take account not only of polling 
data, but of the blogosphere. Architectural historians must be able to 
analyze computer-aided design. What we once called “film studies” 
increasingly will be research on digital media. If these materials are to be 
preserved and accessible, if they are to be searched and analyzed, we must 
have the human and institutional capacities called for in this report. 


Many thanks are in order. The Andrew W. Mellon Foundation provided 
essential resources: the Foundation’s financial support made the report 
possible, and the advice and counsel of Program Officer Donald J. Waters 
helped refine it. Many institutions extended themselves in providing venues 
for the public sessions that helped form the report: the New York Public 
Library; Northwestern University; the University of California, Berkeley; 
the University of Southern California; the Research Libraries Group; the 
Institute of Museum and Library Services. Numerous scholarly leaders gave 
presentations to the Commission, and many others submitted comments on 
earlier drafts of this report. I wish to express thanks also to Abby Smith, 
who served first as Senior Editor and subsequently as an adviser to the 
Commission; to Marlo Welshons, the report’s editor, who worked tirelessly 
yet cheerfully to bring together the words and ideas of the report’s many 
authors and reviewers; and to Sandra Bradley, who helped maintain the 
Commission’s own infrastructure. 


This report addresses its recommendations to college and university leaders, 
to funders, to scholars, and to the public that ultimately supports the 
scholarly and educational enterprise. It is heartening to know that some of 
the recommendations of the report already are being acted upon. With the 
support of the Mellon Foundation, ACLS has begun offering Digital 
Innovation Fellowships designed to advance digital scholarship and to 
exemplify the infrastructure necessary for further advances. Chairman 
Bruce Cole’s announcement of the Digital Humanities Initiative of the 
National Endowment for the Humanities is especially promising. One early 
fruit of that initiative is a new partnership between the Endowment and the 
Institute of Museum and Library Services to help teachers, scholars, 
museums, and libraries work together to advance digital scholarship and 
present it to the widest possible public. The John D. and Catherine T. 
MacArthur Foundation has begun a major new effort to understand and 
develop digital technologies for learning in early education. We can hope 
that other foundations and funders will join the Mellon Foundation in 
extending that focus to higher education. The ACLS remains committed to 
continuing its work in this area through the direct support of scholars and 
by cooperating with our member societies in hopes of providing leadership 
in this rapidly changing domain. 


“Commonwealth” is defined both as “a body or number of persons united 
by a common interest,” and as the “public welfare, general good or 
advantage.” With this report the former meaning, as represented by the 
Commission and ACLS, presents a framework for action that, we believe, 
will advance the latter, the general good. 


Pauline Yu 
President 


American Council of Learned Societies 


Preface: Who Is the Intended Audience for This Report? 


This report is addressed to several related audiences: 


• Senior scholars in the humanities and social sciences in a university 
setting, who have the power to change scholarly practice and the 
responsibility to exercise that power. These individuals need to address 
themselves to their national and professional representatives and, 
locally, to their colleagues, their academic deans, provosts, and 
presidents. 

• Leaders of national academies, scholarly societies, university presses, 
and research libraries, museums, and archives, who share the power 
and responsibility of senior scholars and who can speak to leaders at 
the campus, state, and national levels. 

• University provosts, presidents, and boards of trustees, who must 
decide in the coming decade what strategic investments to make with 
the limited resources of their institutions and who can influence 
legislators. 

• Legislators at the local, state, and national level charged with making 
decisions about funding for public schools, public community 
colleges, public universities, and federally supported research, who 
have the same responsibility to the public with respect to 
cyberinfrastructure as they do for physical infrastructure, and for the 
same reasons—because ultimately, good infrastructure promotes good 
citizenship and good government by promoting tolerance, 
understanding, and prosperity. 

• Federal agencies and private foundations that promote research in the 
humanities and social sciences. These organizations have the power to 
influence individual scholars directly, as well as university provosts, 
university presses, and scholarly societies. 

• Lifelong learners outside the academy who have an abiding interest in 
the pursuit of knowledge in the humanities and social sciences, 
including those who enjoy visiting museums and public libraries or 
informing themselves by reading a book or surfing the Web. Such 
individuals give voice to the intelligence of the general public and, 
through their active support and interest in self-education, can 
influence legislation and funding at the campus, local, state, and 
national levels, simply by making themselves heard. 


Finally, it is important to note that each of these audiences has a 
responsibility to carry the message of the report to other, broader audiences. 
Without the active participation such a process implies, this report cannot 
effect change. 


Executive Summary 




The emergence of the Internet has transformed the practice of the 
humanities and social sciences—more slowly than some may have hoped, 
but more profoundly than others may have expected. Digital cultural 
heritage resources are a fundamental dataset for the humanities: these 
resources, combined with computer networks and software tools, now shape 
the way that scholars discover and make sense of the human record, while 
also shaping the way their findings are communicated to students, 
colleagues, and the general public. Even greater transformations are on the 
horizon, as digitized cultural heritage comes into its own. But we will not 
see anything approaching complete digitization of the record of human 
culture, removal of legal and technical barriers to access, or revolutionary 
change in the academic reward system unless the individuals, institutions, 
enterprises, organizations, and agencies who are this generation’s stewards 
of that record make it their business to ensure that these things happen. 


The organized use of networks and computation for the practice of science 
and engineering was the subject of a 2003 report to the National Science 
Foundation (NSF), Revolutionizing Science and Engineering through 
Cyberinfrastructure. [footnote] In both the NSF report and this one, the term 
cyberinfrastructure is meant to denote the layer of information, expertise, 
standards, policies, tools, and services that are shared broadly across 
communities of inquiry but developed for specific scholarly 
purposes: cyberinfrastructure is something more specific than the network 
itself, but it is something more general than a tool or a resource developed 
for a particular project, a range of projects, or, even more broadly, for a 
particular discipline. So, for example, digital history collections and the 
collaborative environments in which to explore and analyze them from 
multiple disciplinary perspectives might be considered cyberinfrastructure, 
whereas fiber-optic cables and storage area networks or basic 
communication protocols would fall below the line for cyberinfrastructure. 

National Science Foundation, Revolutionizing Science and Engineering 
through Cyberinfrastructure: Report of the National Science Foundation 
Blue-Ribbon Advisory Panel on Cyberinfrastructure (January 2003). 


Recognizing that a revolution similar to the transformation of science and 
engineering addressed in the NSF report is inevitable for the humanities and 
the social sciences and that these disciplines have essential and distinct 
contributions to make in designing, building, and operating 
cyberinfrastructure, the American Council of Learned Societies (ACLS) in 
2004 appointed a Commission on Cyberinfrastructure for the Humanities 
and Social Sciences. This report reflects the reach of its sponsoring 
organization, the ACLS, by focusing on the needs of the humanities and 
nonnormative social sciences, that is, social sciences that are interpretive. 


The ACLS Commission was charged with three tasks: 


1. To describe and analyze the current state of humanities and social 
science cyberinfrastructure 

2. To articulate the requirements and potential contributions of the 
humanities and social sciences in developing a cyberinfrastructure for 
information, teaching, and research 

3. To recommend areas of emphasis and coordination for the various 
agencies and institutions, public and private, that contribute to the 
development of this infrastructure 


Commission members include humanities scholars, social scientists, 
administrators, and entrepreneurs from universities and organizations public 
and private, large and small. All were selected for their experience with 
digital technologies. The Commission’s deliberations were informed by the 
testimony of scholars, librarians, museum directors, social scientists, 
representatives of government and private funding agencies, and many 
other people, gathered in a series of public meetings held in Washington, 
DC; New York City; Chicago; Berkeley; Los Angeles; and Baltimore 
during 2004; by national and international reports by other groups on 
related missions; by advisors to the Commission, selected for particularly 
relevant expertise; and by responses to the draft report, which was made 
available for public comment from November 2005 through January 2006. 


The Commission heard from those who wanted more advanced software 
applications, greater bandwidth, and more access to expertise in information 
technology. We also heard from many who spoke about the potential for 
cyberinfrastructure to enhance teaching, facilitate research collaboration, 
and increase public access to (and fair use of) the record of human cultures 
across time and space. As a result, this report addresses the particular needs 
and contributions of those directly engaged in teaching, research, and 
cultural work; but it also places those needs and contributions in a larger 
context, namely, the public good that these activities, collectively, produce. 


As more personal, social, and professional time is spent online, it will 
become increasingly important to have an online environment that 
cultivates the richness of human experience, the diversity of human 
languages and cultures, and the full range of human creativity. Such an 
environment will best emerge if its design can benefit from the strengths of 
the humanities and social sciences: clarity of expression, the ability to 
uncover meaning even in scattered or garbled information, and centuries of 
experience in organizing knowledge. These strengths are especially 
important as the volume of digital resources grows, as complexity 
increases, and as we struggle to preserve and make sense of billions of 
sources of information. 


Many who work in the humanities and social sciences have come to 
recognize that knowledge in these disciplines is on the edge of some 
fundamental changes, and that it would be better to approach these changes 
with specific goals in mind. This report suggests what some of those goals 
might be. The Introduction answers a few fundamental questions: What is 
cyberinfrastructure? What do we mean when we refer to the humanities and 
social sciences? And what are the distinctive needs and contributions of 
these disciplines in cyberinfrastructure? 


As the title of this report is meant to indicate, the online world is a new 
cultural commonwealth in which knowledge, learning, and discovery can 
flourish. Our aim, therefore, is to show how best to achieve this cultural 
commonwealth for the betterment of all. 


Chapter 1 makes the case for the transformative potential of an improved 
cyberinfrastructure with respect to the preservation and availability of our 
cultural heritage. A coordinated effort to build cyberinfrastructure for the 
humanities and social sciences, the Commission argues, will benefit the 
public and the specialist alike by providing access to the breadth and depth 
of the cultural record. 


Chapter 2 explores the constraints that must be overcome in creating such a 
cyberinfrastructure—insufficient training, outdated policies, unsatisfactory 
tools, incomplete resources, and inadequate access. These constraints are 
not primarily technological but, instead, cultural, economic, legal, and 
institutional. They include: 


• the loss, fragility, and inaccessibility of the cultural record; 

• the complexity of the cultural record; 

• intellectual property restrictions on the use of the cultural record; 

• lack of incentives to experiment with cyberinfrastructure in the humanities 
and social sciences; 

• uncertainty about the future mechanisms, forms, and economics of 
scholarly publishing and scholarly communication more generally; 

• insufficient resources, will, and leadership to build 
cyberinfrastructure for the humanities and social sciences. 


Chapter 3 provides a framework for action. It first articulates five goals for 
an effective cyberinfrastructure, namely, that it should 


1. be accessible as a public good; 

2. be sustainable; 

3. provide interoperability; 

4. facilitate collaboration; 

5. support experimentation. 


In chapter 3, the Commission also recommends the following measures 
necessary to achieve these goals (and to meet the challenges described in 
chapter 2): 


1. Invest in cyberinfrastructure for the humanities and social sciences, 
as a matter of strategic priority. 


Addressed to: Universities; federal and private funding agencies 


Implementation: Determine the amount and efficacy of funding that now 
goes to support developing cyberinfrastructure for humanities and social 
sciences from all sources; through annual meetings and ongoing 
consultation, coordinate the goals this funding aims to achieve; and aim to 
increase both funding and coordination over the next five years, including 
commercial investments that are articulated with the educational 
community’s agenda. 


2. Develop public and institutional policies that foster openness and 
access. 


Addressed to: University presidents, boards of trustees, provosts, and 
counsels; university presses; funding agencies; libraries; scholarly societies; 
Congress 


Implementation: The Association of American Universities, in 
collaboration with other organizations such as the National Humanities 
Alliance, the Scholarly Publishing and Research Coalition, and the National 
Academy of Arts and Sciences, should take a leadership role in 
coordinating the engagement of the humanities and social sciences with 
issues of information policy. 


3. Promote cooperation between the public and private sectors. 


Addressed to: Universities; federal and private funding agencies; Internet- 
oriented companies 


Implementation: A private foundation, a federal funding agency, an Internet 
business, and one or more university partners should cosponsor recurring 
annual summits to explore new models for commercial/nonprofit 
partnerships and to discuss opportunities for the focused creation of digital 
resources with high educational value and high public impact. 


4. Cultivate leadership in support of cyberinfrastructure from within 
the humanities and social sciences. 


Addressed to: Senior scholars; scholarly societies; university 
administrators; senior research librarians and research library organizations; 
academic publishing organizations; federal funding agencies; private 
foundations 


Implementation: Increase federal and foundation funding to one or more 
scholarly organizations in the area of humanities and social science 
computing so that they can work with member organizations of the 
American Council of Learned Societies (ACLS) and others to establish 
priorities for cyberinfrastructure development, raise awareness of research 
and partnership opportunities among scholars, and coordinate the evolution 
of research products from basic to applied. 


5. Encourage digital scholarship. 


Addressed to: Universities; research libraries; the National Endowment for 
the Humanities (NEH); the National Endowment for the Arts (NEA); the 
Institute of Museum and Library Services (IMLS); the National Academies; 
the National Archives; major private foundations; major scholarly societies; 
individual leaders in the humanities and social sciences 


Implementation: Federal funding agencies and private foundations should 
establish programs that address workforce issues in digital humanities and 
social sciences, from short-term workshops to postdoctoral and research 
fellowships to the cultivation of appropriately trained computer 
professionals. The ACLS should lead its member organizations in 
developing uniform policies with respect to digital scholarship in tenure and 
promotion. 


6. Establish national centers to support scholarship that contributes to 
and exploits cyberinfrastructure. 


Addressed to: Universities; Congress; state legislatures; public funding 
agencies; private foundations 


Implementation: Universities should develop national and international 
fellowships at existing humanities and social science computing centers, 
and develop new centers with such programs, with a combination of 
university, federal, and private funding. 


7. Develop and maintain open standards and robust tools. 


Addressed to: Funding agencies, public and private; scholars; librarians; 
curators; publishers; technologists 


Implementation: University consortia such as the Committee on 
Institutional Cooperation should license the SourceForge software and 
make it available to open-source developers in academic institutions. The 
National Endowment for the Humanities (NEH), National Archives and 
Records Administration (NARA), and the Institute of Museum and Library 
Services (IMLS) should support the development, maintenance, and 
coordination of community-based standards such as the Text Encoding 
Initiative, Encoded Archival Description, Metadata Encoding and 
Transmission Standard, and Visual Resources Data Standards. The National 
Science Foundation (NSF), the Andrew W. Mellon Foundation, the IMLS, 
and other funding agencies should support the development of tools for the 
analysis of digital content. 


8. Create extensive and reusable digital collections. 


Addressed to: The National Endowment for the Arts (NEA), the National 
Endowment for the Humanities (NEH), the Institute of Museum and 
Library Services (IMLS), the National Archives and Records 
Administration (NARA), and other funding agencies, both public and 
private; scholars; research libraries and librarians; university presses; 
commercial publishers 


Implementation: National centers with a focus on particular methods or 
disciplines can organize a certain amount of scholar-driven digitization. 
Library organizations and libraries should sponsor discipline-based focus 
groups to discuss priorities with respect to digitization. When priorities are 
established, these should be relayed to the organizers of annual meetings on 
commercial and nonprofit partnerships, and they should be considered in 
the distribution of grant funds by federal agencies and private foundations. 
Funding to support the maintenance and coordination of standards will 
improve the reusability of digital collections. The NEA, NEH, and IMLS 
should work together to promote collaboration and skills development— 
through conferences, workshops, and/or grant programs—for the creation, 
management, preservation, and presentation of reusable digital collections, 
objects, and products. 


Finally, in light of these requirements and in order to realize the promise of 
cyberinfrastructure for research and education, the Commission calls for 
specific investments—not just of money but also of leadership—from 
scholars and scholarly societies; librarians, archivists, and curators; 
university provosts and university presses; the commercial sector; 
government; and private foundations. 


What Is Cyberinfrastructure? 




We need first to define our terms—especially the term that is most essential 
to this report: cyberinfrastructure. The infrastructure of scholarship was 
built over centuries. It includes diverse collections of primary sources in 
libraries, archives, and museums; the bibliographies, searching aids, citation 
systems, and concordances that make that information retrievable; the 
standards that are embodied in cataloging and classification systems; the 
journals and university presses that distribute the information; and the 
editors, librarians, archivists, and curators who link the operation of this 
structure to the scholars who use it. All of these elements have extensions 
or analogues in cyberinfrastructure, at least in the cyberinfrastructure that is 
required for humanities and social sciences. 


The 2003 National Science Foundation report Revolutionizing Science and 
Engineering through Cyberinfrastructure (hereafter referred to as the 
“Atkins report,” after Dan Atkins, who chaired the committee that produced 
it) described cyberinfrastructure as a “layer of enabling hardware, 
algorithms, software, communications, institutions, and personnel” that lies 
between a layer of “base technologies . . . the integrated electro-optical 
components of computation, storage, and communication” and a layer of 
“software programs, services, instruments, data, information, knowledge, 
and social practices applicable to specific projects, disciplines, and 
communities of practice.” In other words, for the Atkins report (and for this 
one), cyberinfrastructure is more than a tangible network and means of 
storage in digitized form, and it is not only discipline-specific software 
applications and project-specific data collections. It is also the more 
intangible layer of expertise and the best practices, standards, tools, 
collections and collaborative environments that can be broadly shared 
across communities of inquiry. “This layer,” as the Atkins report notes, 
“should provide an effective and efficient platform for the empowerment of 
specific communities of researchers to innovate and eventually 
revolutionize what they do, how they do it, and who participates.” As the 
NSF panel issuing that report further noted, “if infrastructure is required for 
an industrial economy, then we could say that cyberinfrastructure is 
required for a knowledge economy.” 


One characteristic of infrastructure is that it is deeply embedded in the way 
we do our work. When it works efficiently, it is invisible: we use it without 
really thinking about it. When we drive a car, we rely on an infrastructure 
that includes physical systems of minor and major roads; societal and 
governmental systems for licensing drivers, setting speed limits, and 
codifying driver conduct; and economic systems of license fees and 
gasoline taxes to maintain and expand the roads. The technical and societal 
systems that make up cyberinfrastructure will need to support the entire 
range of research goals, legal requirements, and objects of attention for the 
natural sciences, social sciences, and humanities. 


Infrastructure becomes an installed base on which other things are built. 
Because it is extensive and expensive, infrastructure tends to be built 
incrementally, not all at once nor everywhere at once. [footnote] In the 
humanities and social sciences, we have been building extensive and widely 
used collections—digital libraries—over the last fifteen years or more, and 
we have been developing standards for expressing, exchanging, and 
preserving these collections. Now it is time to build the tools that will 
enable new learning and teaching and to develop new audiences who can 
benefit from this scholarship. 

Susan Leigh Star and Karen Ruhleder, “Steps toward an Ecology of 
Infrastructure,” Information Systems Research 7.1 (1996): 111-34. 


What Are the Humanities and Social Sciences? 




One definition of the humanities is provided in the National Foundation on 
the Arts and the Humanities Act [footnote] of 1965: 

National Endowment for the Arts 
http://arts.endow.gov/about/Legislation/Legislation.html. 


"The term “humanities” includes, but is not limited to, the study of the 
following: language, both modern and classical; linguistics; literature; 
history; jurisprudence; philosophy; archaeology; comparative religion; 
ethics; the history, criticism and theory of the arts; those aspects of social 
sciences which have humanistic content and employ humanistic methods; 
and the study and application of the humanities to the human environment 
with particular attention to reflecting our diverse heritage, traditions, and 
history and to the relevance of the humanities to the current conditions of 
national life." 


The social sciences, as they are understood in this report, are actually more 
difficult to define. The American Council of Learned Societies represents 
“interpretive” social sciences, that is, social sciences that are more 
qualitative than quantitative in their methods. But the Commission is not 
interested in staking out territory, nor does it seem necessary that there 
should be a one-to-one correspondence between disciplines and 
commissions or their reports: indeed, the twenty-seven reports on 
cyberinfrastructure currently listed on the NSF Web page devoted to 
“Cyberinfrastructure and Its Impacts” [footnote] make clear that the blurring 
of these boundaries is one of the characteristics of cyberinfrastructure. If the 
emerging cyberinfrastructure is to support creativity, inquiry, and the 
broadest increase of knowledge, it must include the contributions of the 
humanities and the interpretive social sciences. 

National Science Foundation http://www.nsf.gov/od/oci/reports.jsp. 


What Is Digital Scholarship? 


In recent practice, "digital scholarship" has meant several related things: 


1. Building a digital collection of information for further study and 
analysis 

2. Creating appropriate tools for collection-building 

3. Creating appropriate tools for the analysis and study of collections 

4. Using digital collections and analytical tools to generate new 
intellectual products 

5. Creating authoring tools for these new intellectual products, either in 
traditional forms or in digital form 


It may seem odd to some that creating collections and the tools to use them 
should be counted as scholarship, but humanities and social science 
research has always required collections of appropriate information, and 
throughout history, scholars have often been the ones to assemble those 
collections, as part of their scholarship. Moreover, scholars have been 
building tools since the first index, the first concordance, the first scholarly 
edition. So, while it is reasonable to regard (4) as the core meaning and 
ultimate objective of “digital scholarship,” it is also important to recognize 
that in the early digital era, leadership may well consist of 
collection-building or tool-building. In addition, tool-building is dependent on the 
existence of collections, and both collections and tools get better and more 
general as there is more use of digital information. If we hope to see new 
intellectual products, we should give high priority to building tools and 
collections. Finally, it is worth noting that although (1), (2), (3), and (5) 
require a great deal of cooperation, it is still imaginable that (4) can be the 
work of a single individual. 


What Are the Distinctive Needs and Contributions of the Humanities and 
Social Sciences in Cyberinfrastructure? 




In the National Foundation on the Arts and the Humanities Act of 
1965 [footnote], the legislation that created the National Endowment for 
the Humanities, two of the leading arguments presented for the act are: 

National Endowment for the Humanities 
http://www.neh.gov/nehat40/founding/legislation.html. 


"(3) An advanced civilization must not limit its efforts to science and 
technology alone, but must give full value and support to the other great 
branches of scholarly and cultural activity in order to achieve a better 
understanding of the past, a better analysis of the present, and a better view 
of the future." 


"(4) Democracy demands wisdom and vision in its citizens. It must 
therefore foster and support a form of education, and access to the arts and 
the humanities, designed to make people of all backgrounds and wherever 
located masters of their technology and not its unthinking servants." 


Both of these arguments remain true as we enter into an “advanced 
civilization” that depends on technology for the daily business of the culture 
as well as for its education and its research. The humanities and the social 
sciences are critical players in the development of cyberinfrastructure 
because they deal with the intractability, the rich ambiguity, and the 
magnificent complexity that is the human experience. 


In the Atkins report, cyberinfrastructure consists of 


• grids of computational centers; 
• comprehensive libraries of digital objects; 
• well-curated collections of scientific data; 
• online instruments and vast sensor arrays; 
• convenient software toolkits. 


Humanities scholars and social scientists will require similar facilities but, 
obviously, not exactly the same ones: “grids of computational centers” are 
needed in the humanities and social sciences, but they will have to be 
staffed with different kinds of subject-area experts; comprehensive and 
well-curated libraries of digital objects will certainly be needed, but the 
objects themselves will be different from those used in the sciences; 
software toolkits for projects involving data-mining and data-visualization 
could be shared across the sciences, humanities, and social sciences, but 
only up to the point where the nature of the data begins to shape the nature 
of the tools. Science and engineering have made great strides in using 
information technology to understand and shape the world around us. This 
report is focused on how these same technologies could help advance the 
study and interpretation of the vastly more messy and idiosyncratic realm of 
human experience. 


Building a cyberinfrastructure for the humanities and social sciences 
presents an opportunity to take advantage of prevailing economic, 
organizational, and technological forces. We have remarkable opportunities 
to bring new analytic and interpretive power to bear on the materials and 
the methods of the humanities and the social sciences: by so doing, we can 
advance our understanding of human cultures past, present, and future. In 
the process, however, scholars, librarians, publishers, and universities will 
also have to re-examine their own academic culture, rethinking its outward 
forms, its established practices, and its apparent assumptions. 


The case for why and how to seize this opportunity is presented in the 
following chapters. Chapter 1 articulates a vision for the future of the 
humanities and social sciences. Chapter 2 highlights some of the 
fundamental constraints that could limit our ability to achieve that vision. 
Chapter 3 presents a framework for making the changes needed to 
overcome those constraints and for undertaking the online integration of the 
cultural record. 


A Grand Challenge for the Humanities and Social Sciences 




In the 1970s experimental networks emerged from the university and were, 
at first gingerly, picked up by the general public. At this stage the most 
interesting applications for these networks came out of the university world: 
the Ethernet protocol was developed in Robert Metcalfe’s (initially 
unsuccessful) Harvard dissertation (1973); twenty years later, in April 1993, 
Mosaic—the first graphical web browser, from which are descended all 
other browsers that we use today—was released from the National Center 
for Supercomputing Applications (NCSA) at the University of Illinois, 
Urbana-Champaign. In the next year, Web traffic grew at an annual rate of 
341,634%. [footnote] By 2004, just about a decade after Mosaic, the 
networks had become completely public in nature, and they are now 
thoroughly naturalized by the public. According to the Pew Internet & 
American Life Project, more than 60% of Americans are online: 

Hobbes' Internet Timeline v8.0 
http://www.zakon.org/robert/internet/timeline/. 


"On a typical day at the end of 2004, some 70 million American adults 
logged onto the Internet to use email, get news, access government 
information, check out health and medical information, participate in 
auctions, book travel reservations, research their genealogy, gamble, seek 
out romantic partners and engage in countless other activities. That 
represents a 37% increase from the 52 million adults who were online on an 
average day in 2000 when the Pew Internet & American Life Project began 
its study of online life. . . . The Web has become the “new normal” in the 
American way of life; those who don’t go online constitute an ever- 
shrinking minority." 


By 2005, the Pew Survey reports, the percentage of American adults online 
had increased, in one year, from 60% to 73%. [footnote] But it is 
teenagers (12-17) who have the highest share of Internet participation (87% 
are online): they regard e-mail as “something for ‘old people,’” and they 
have “embraced the online applications that enable communicative, 
creative, and social uses. [They] are significantly more likely than older 
users to send and receive instant messages, play online games, create blogs, 
download music, and search for school information.” [footnote] 


The challenge for scholars and teachers is to ensure that they engage this 
outpouring of creative energy, seize this openness to learning, and lead 
rather than follow in the design of this new cultural infrastructure. And, in 
fact, over the last fifty years, a small but growing number of scholars in the 
humanities and social sciences have been using digital tools and 
technologies with increasing sophistication and innovation, transforming 
their practices of collaboration and communication. Some have been true 
media pioneers, testing the limits of the systems, policies, and funding 
sources that support digital scholarship. These digital groundbreakers have 
provided breathtaking views into what could be achieved with a more 
robust humanities and social science cyberinfrastructure. What new heights 
would be reached if a leveraged, coordinated investment, as outlined in this 
report, were undertaken? 


Were such an infrastructure available, scholars would not be the only 
beneficiaries: everyone online could explore connections within a cultural 
record that is now scattered across libraries, archives, museums, galleries, 
and private collections around the world, under varying conditions of 
stability and accessibility. A better understanding of ourselves, our world, 
and our past would result, as well as a richer framework for learning and 
scholarship. 


In spite of high-profile efforts such as Google Book Search, [footnote] most 
of the human record has not yet been digitized, nor is it likely to be for 
some time to come. For the humanities and social sciences, then, an 
effective cyberinfrastructure will have to support the computer-assisted use 
of both physical and digital resources, and it will have to enable 
communication and collaboration using a range of digital surrogates for 
physical artifacts; in fact, it will have to embody an understanding of the 
continuity between digital and physical, rather than promoting the notion 
that the two are distinct from or opposed to one another. A 
cyberinfrastructure for humanities and social sciences must encourage 
interactions between the expert and the amateur, the creative artist and the 
scholar, the teacher and the student. It is not just the collection of data— 
digital or otherwise—that matters: at least as important is the activity that 
goes on around it, contributes to it, and eventually integrates with it. 


Creating such an infrastructure is a grand challenge for the humanities and 
social sciences, and indeed for the academy, the nation, and the world, 
because a digitized cultural heritage is not limited by or contained within 
disciplinary boundaries, individual institutions, or national borders. The 
resources that make up our cultural record are often found far from the site 
of their creation and use, carried off as spoils of war, relocated to museum 
exhibitions or storage, or hidden away in private collections. We now have 
an opportunity to create an integrated digital representation of the cultural 
record, connecting its disparate parts and making the resulting whole more 
available to one and all, over the network. 


Creating this integrated, networked cultural record will require intensive 
collaboration among scholars as well as cooperation with librarians, 
curators, and archivists; the involvement of experts in the sciences, law, 
business, and entertainment; and active participation from and endorsement 
by the general public. Enabling anything like seamless access to the cultural 
record will require developing tools to navigate among vast catalogs of 
born-digital and digitized materials, as well as the records of physical 
materials: it will also require addressing daunting problems in digital 
preservation, copyright, and economic sustainability. The return on this 
investment will be a humanities and social science cyberinfrastructure that 
will allow new questions to be asked, new patterns and relations to be 
discerned, and deep structures in language, society, and culture to be 
exposed and explored. 


Librarians, curators, archivists, and the private sector are already joining 
forces with the objective of creating universal access to knowledge 
anywhere and everywhere. The Open Content Alliance has shown that 
commercial, nonprofit, and university content creators can cooperate in 
powerful ways to increase open access to cultural resources. Google has as 
its stated mission “to organize the world's information and make it 
universally accessible and useful,” albeit not on open-access terms. From a 
technical perspective, Google Book Search has shown that we can digitize 
collections of millions of books, although it needs to be acknowledged that 
even those millions of books constitute only a tiny fraction of the cultural 
record that exists in archives, museums of all types, and rare book 
collections as well as, of course, in music, visual arts, maps, photography, 
movies, radio, television, video games, and other forms of new media. 


Librarians speak increasingly today of building the “global digital library,” 
while museum curators talk of “heading toward a kind of digital global 
museum”; archivists have been experimenting with virtual finding aids that 
provide unified online access to records that are physically dispersed. 
[footnote] Yet the digital medium is compelling and effective not just 
because it integrates materials otherwise divided in space and time, but also 
because it integrates these various genres in ways that make it possible to 
extend study relatively seamlessly across them. Every day, these nontextual 
materials proliferate faster than does text, and every day, they grow in 
importance to fields throughout the humanities and social sciences. Our 
communications environment already includes not just text but still and 
moving images, audio files, and social interactivity forums, making it 
imperative that the humanities and social sciences be included in the 
process of designing cyberinfrastructure. 

See Deanna Marcum, “The Sum of the Parts: Turning Digital Library 
Initiatives into a Great Whole,” keynote address to the Joint Conference on 
Digital Libraries, Denver, Colorado (8 June 2005); and Ben Williams, lead 
librarian at the Field Museum, quoted in James Gorman, “In Virtual 
Museums, An Archive of the World,” New York Times, 12 Jan. 2003. 


As the Internet becomes home to more of our cultural heritage, the issues of 
access, management, and preservation become ever more critical. In their 
study “How Much Information,” Peter Lyman and Hal R. Varian have 
tracked the steadily increasing amounts of information produced each year, 
in all media. In 2003, analyzing chiefly 2002 data, they estimated 
production of 300 terabytes (TB) of print, 25TB of movies, 375,000TB of 
digital photography, 987TB of radio, 8,000TB of television, 58TB of audio 
CDs—and their estimates do not include software (such as video games) or 
materials originally produced for the Web, or more ephemeral forms of 
digital information such as phone calls or instant messaging. [footnote] A 
Wall Street Journal article in late 2005 described the effort that the National 
Archives and Records Administration is making to manage the digital 
output of the federal government: from President George W. Bush’s 
administration, the expected volume of e-mail alone is estimated to be more 
than 100 million messages. [footnote] 

Peter Lyman and Hal R. Varian, "How Much Information" (2003) 
http://www.sims.berkeley.edu/how-much-info-2003. 

Anne Marie Squeo, “Oh, Has Uncle Sam Got Mail: As Digital Documents 
Pile Up, The National Archives Worries about Technical Obsolescence,” 
Wall Street Journal, 29 Dec. 2005. 


The challenge is indeed grand in scale; hence, now is the time for ambitious 
thinking about what advances in information technology and 
communications networks have to offer the humanities and social sciences, 
and, in turn, how such advances can ultimately serve the public. 


Decades of Accelerating Change 




The recent transition to an Internet culture is documented by a series of 
surveys and reports by the American Council of Learned Societies (ACLS) 
and the Research Libraries Group (RLG). In the mid-1980s, the ACLS 
surveyed almost four thousand scholars in the humanities and social 
sciences to learn what they “think about a wide range of issues of greatest 
concern to their careers, their disciplines, and higher education in general.” 
The survey’s first finding was the “rapid increase in computer use.” “In 
1980,” the report notes, only “about 2 percent of all respondents either 
owned a computer or had one on loan for their exclusive use.” But by 1985, 
it observes with obvious excitement, “the number was 45 percent, most of 
whom used it not only for routine word processing but for other purposes as 
well.” Those “other purposes” were, however, clearly minority pursuits. 
Only about one in five scholars reported using online library catalogs or 
databases; only one in ten used e-mail; just 7 percent (most of them in 
classics or linguistics) said that they had used a computer for “theme, text, 
semantic, or language analysis.” [footnote] 

Herbert Charles Morton, Anne J. Price, and Robert Cameron Mitchell, The 
ACLS Survey of Scholars: Final Report of Views on Publications, 
Computers, and Libraries (Lanham, MD: University Press of America, 
1989). 


In 1988 RLG published a detailed assessment of information needs in the 
humanities and social sciences. [footnote] The responses of the humanists 
interviewed were consistent across disciplines: they wanted more machine- 
readable catalogs, indexes, and other finding aids. There was little interest 
in making full texts available in digital form, partly because the technology 
was new and untested, but also because scholars were accustomed to the 
informal, book-based, and often serendipitous browsing methods of 
research that had been fundamental to humanities scholarship for centuries. 
Image databases for two- and three-dimensional objects were largely 
beyond the capacities of the technology, and the budgets, of the time. 

Constance Gould, Information Needs in the Humanities: An Assessment 
(Mountain View, CA: Research Libraries Group, 1988). 


The RLG report showed the social sciences to be more dependent on 
technology than were the humanities; almost every social science discipline 
in 1988 had a trusted machine-readable index associated with scholarship 
and research in the relevant academic fields. The social sciences were 
interested in the availability of electronic databases and datasets for 
research support; for example, the census and Inter-University Consortium 
for Political and Social Research (ICPSR) materials were already well 
established in several disciplines. Scholars in the social sciences also 
expressed interest in using technology to improve access to conference 
papers, unpublished research, and technical reports. 


In 1997 the ACLS issued a study focusing on information technology in the 
humanities. [footnote] Published fewer than ten years after the RLG report, 
it revealed greater acceptance of technology in the humanities, greater 
technical knowledge, and a belief that information technology could enrich 
and influence research. Its chief recommendations included a call for a 
national strategy for digitizing texts, images, sound, and other media 
pertinent to the cultural heritage as well as for coordinated large-scale 
projects to effect this digitization; more pervasive technical standards; 
greater attention to the challenges of preservation of digital information 
over time; and a need to promote within the universities a more hospitable 
environment for computer-supported arts and humanities. 

Pamela Pavliscak, Seamus Ross, and Charles Henry, “Information 
Technology in Humanities Scholarship: Achievements, Prospects, and 
Challenges—The United States Focus” (New York: American Council of 
Learned Societies, 1997), ACLS Occasional Paper No. 37 
http://www.acls.org/op37.htm. 


The findings and recommendations of the 1988 RLG report seemed almost 
quaint to those scholars interviewed less than a decade later, underscoring 
revolutionary advances in information technology now taken for granted. 
Almost every scholar regards a computer as basic equipment. Information is 
increasingly created and delivered in electronic form. E-mail and instant 
messaging have broadened circles of communication and increased the 
amount and, arguably, the quality of debate among dispersed scholarly 
communities. These changes were the result of the availability and 
usefulness of first-generation cyberinfrastructure. 


Networked access to information sources in the humanities and social 
sciences has increased dramatically in recent years, largely because of the 
widespread adoption of the Web as a kind of first-generation, all-purpose 
cyberinfrastructure. Through the Web, Project MUSE [footnote] offers more 
than 250 online, full-text contemporary journals in the humanities, arts, and 
social sciences. The journals can be searched by keywords, and the reader 
can follow links to relevant footnotes and other related journal articles. 
JSTOR [footnote] (an abbreviated designation for Journal Storage) is a large 
archive of older publications, some extending back a hundred years. 
Currently JSTOR contains 614 journals from 375 publishers, with more 
than fourteen million pages. Another project, ARTStor, [footnote] modeled 
on JSTOR, focuses on art images drawn from many time periods and 
cultures. ARTStor holds hundreds of thousands of images contributed by 
museums, archeological teams, and photo archives, as well as tools and 
indexes that facilitate productive use of this vast collection. InteLex Past 
Masters [footnote] is a large dataset of full texts, usually in the form of 
complete works of major thinkers in the social sciences—particularly 
economics, political thought and theory, and sociology. Social scientists and 
students often turn to this Web site for trusted editions of, for example, 
Charles Darwin, Herbert Spencer, or Adam Smith. For authors who wrote 
in languages other than English, an English translation is provided. 
Cogprints [footnote] is often the first place scholars go for information 
pertinent to the study of cognition: psychology, anthropology, and other 
social sciences that include elements of cognitive study are represented by a 
wealth of digitized research. 

Johns Hopkins University http://muse.jhu.edu/. 

http://www.jstor.org/. 

http://www.artstor.org/. 

http://library.nlx.com/. 

Cognitive Sciences Eprint Archive http://cogprints.org/. 


Cultural Infrastructure and the Public 




In 1990 the World Wide Web was just an idea—or, more specifically, a 
proposal entitled “Information Management” [footnote] being circulated by 
Tim Berners-Lee at CERN (Conseil Européen pour la Recherche 
Nucléaire/European Organization for Nuclear Research). In 1993 there were 
two hundred known Web servers. [footnote] Ten years later, in 2003, there 
were forty million servers, and in 2006, that number has doubled to more 
than eighty million servers hosting billions of Web pages. [footnote] For 
many people, access to the Internet and its resources is now indispensable, 
but it is more than a place where people shop, seek information, or find 
entertainment. According to the Pew Internet & American Life Project 
study, [footnote] the Internet “creates new online town squares” and 
“enhances the relationship of citizens to their government.” 

Tim Berners-Lee http://www.w3.org/History/1989/proposal.html. 

http://www.w3.org/History.html. 

Netcraft, “April 2006 Web Server Survey” 
http://news.netcraft.com/archives/2006/04/06/april_2006_web_server_survey.html. 

Pew Internet & American Life Project, “Internet: The Mainstreaming of 


Putting the historical record online opens it to people who rarely have had 
such access before. For example, the Library of Congress allows high-school 
students into its reading rooms only under special circumstances, but 
any student may enter its American Memory site [footnote] to view the 
virtual archive on the same terms of access as the most senior historian or 
member of Congress. If digitized properly, many online texts and images 
are accessible to those with visual impairments or other disabilities through 
screen readers and other supportive technologies. 

Library of Congress, American Memory 
http://memory.loc.gov/ammem/index.html. 


Digital collections also allow for juxtapositions of works that are held in 
disparate physical collections. For example, the William Blake Archive 
[footnote] not only makes the works of Blake available to the general public 
but also allows users to juxtapose and compare works that are physically 
housed in libraries, museums, and art galleries around the world. 

Library of Congress http://www.blakearchive.org/blake/. 


This remarkable connectivity has brought scholars into broader 
communication with nonscholarly audiences, as well. Humanists and social 
scientists now routinely hear from students and members of the general 
public who have found their e-mail addresses and have questions. Scholars 
who have created Web sites based on their work are often pleasantly 
surprised that their work has found entirely new audiences—or, rather, that 
new audiences have found that work. Nonacademic users of the University 
of North Carolina’s archival Web site Documenting the American South 
[footnote] speak eloquently of feeling “privileged to have access to these 
primary sources, as if they had entered an inner sanctum where they did not 
fully belong,” reports former university librarian Joe Hewitt. 

http://docsouth.unc.edu/. 


Still, access is far from universal. Those who use freely accessible resources 
will find materials published before World War I more plentiful than newer 
materials, owing to copyright limitations. Scholars and members of the 
public who are not affiliated with research universities will find that access 
to a significant number of resources is by subscription only, and that 
subscription is priced at a level that only institutions can afford. One 
independent scholar of history and respondent to a survey on use of digital 
resources (conducted in the course of the Commission’s work by the Center 
for History and New Media) speaks for many when she says: 


"I am an independent scholar [and] so do not have the kind of access to 
facilities that academics do. A research associateship at the Five College 
Women's Studies Research Center allows me the access via Mount Holyoke 
College, [but] only during the term of the association. So yes, there are 
problems for those of us not attached to a subscribing institution." 


Beyond the digitization of existing materials, projects to collect and
preserve born-digital content are critically important. In 1994, for example,
film director Steven Spielberg established Survivors of the Shoah Foundation,
with a mission to videotape and preserve the testimonies of Holocaust
survivors and witnesses. Today the USC Shoah Foundation Institute’s Visual
History Archive [footnote] at the University of Southern California has
collected more than fifty-two thousand eyewitness testimonies in fifty-six
countries and thirty-two languages, all of which are extensively indexed so
that sophisticated searching in the archive can be easily conducted by anyone
via the Internet. In 1996 The Internet Archive [footnote] was founded with
the purpose of offering permanent access for researchers, historians, and
scholars to historical collections that exist in digital format.

http://www.usc.edu/schools/college/vhi/.


Seeing in New Ways 


Evolving technologies not only provide unprecedented access to a variety 
of cultural artifacts but also make it possible to see these artifacts in 
completely new ways. Thanks to high-end digital imaging, we can examine 
and compare ancient cuneiform inscriptions with new precision and clarity. 
[footnote] We can see the much-damaged manuscript of Beowulf in a way
that renders the text more legible than the original, and we can “peel back”
successive conservation treatments to see how the varying states of the
artifact over time have influenced interpretation. [footnote] Other ambitious
and comprehensive editing projects reproduce the complex genealogy of a
medieval text [footnote] or recreate the many sources and states of the works
produced across an entire lifetime by an influential nineteenth-century
author working in the age of print. [footnote] Three-dimensional modeling
makes it possible to recreate Roman forums, [footnote] medieval cathedrals,
[footnote] and Victorian exhibitions. [footnote] These models may provide
more than just a sense of place for the user—in the process of building the 
model, scholars often learn surprising new things about how the originals 
must have been constructed. 

University of California, Los Angeles, and Max Planck Institute, Cuneiform 
Digital Library Initiative (2005) http://cdli.ucla.edu/; InscriptiFact and 
University of Southern California, West Semitic Research (2004) 
http://www.inscriptifact.com/. 

British Library, The Electronic Beowulf (2003) 
http://www.uky.edu/~kiernan/eBeowulf/guide.htm. 

University of Virginia, The Piers Plowman Electronic Archive (2005) 
http://jefferson.village.virginia.edu/seenet/piers/. 

University of Virginia, Institute for Advanced Technology in the
Humanities, The Rossetti Archive (2005) http://www.rossettiarchive.org/.

University of California, Los Angeles, Cultural Virtual Reality Lab (2005)
http://www.cvrlab.org/.

University of Virginia, Institute for Advanced Technology in the 
Humanities, Salisbury Project, Cathedral Model (2005) 
http://www3.iath.virginia.edu/salisbury/model/index.html. 


University of Virginia, Institute for Advanced Technology in the 
Humanities, The Crystal Palace (2005) 
http://www.iath.virginia.edu/london/model/.


Digital video reformats fragile film and thus gives us access to rare footage 
of dance performances from the early decades of the last century. 
[footnote] Mapping technology allows us to understand the rapid spread of
religious hysteria in the Massachusetts Bay Colony during the seventeenth
century [footnote] or to observe the evolution of the built and natural
environment around Boston’s Back Bay over two centuries. [footnote] The
Valley of the Shadow project contains extensive records in the form of
digitized diaries, letters, newspapers, statistical records, and photographs
and other images of the period leading up to and following the Civil War; it
also has animated maps of battles that visually reconstruct troop
movements, points of battle engagement, and other data drawn from army
and navy records of the time. [footnote]

See, e.g., the Library of Congress’s American Memory site’s List of Variety
Stage Films http://www.memory.loc.gov/ammem/vshtml/vsfmlst.html.

University of Virginia, The Salem Witch Trials (2005)
http://etext.virginia.edu/salem/witchcraft/home.html.

University of Virginia, Institute for Advanced Technology in the 
Humanities, Evolutionary Infrastructure (2005) 
http://www3.iath.virginia.edu/backbay/. 

University of Virginia, The Valley of the Shadow (2005) 
http://valley.vcdh.virginia.edu/. 


These and other digital projects show how digital technology can offer us 
new ways of seeing art, new ways of bearing witness to history, new ways 
of hearing and remembering human languages, new ways of reading texts, 
ancient and modern. With some extension, the same infrastructure used for 
such projects can also allow us to work in collaboration with distant 
colleagues who provide complementary expertise, and whom we may meet 
face-to-face only rarely. And all of this is about access: access to 
colleagues; or access through digital representations to distant, damaged, or 
disappeared physical artifacts; or intellectual access to the meaning or 
significance of these artifacts. 


Working in New Ways 


In the last decade, users of the Web have gained unprecedented access to 
pre-twentieth-century cultural materials, but the real promise of our digital
collections has yet to be realized. There is still a long way to go before we 
achieve even basic access to primary sources that will allow scholars and 
public researchers to work in new ways. A survey of special collections that 
was conducted by the Association of Research Libraries in 1998 found that 
the uncataloged backlog of manuscript collections represented one-third of 
repository holdings. A similar survey conducted in 2003-2004 showed that 
34% of archives and manuscript repositories have at least half of their 
holdings unprocessed; 60% have at least one-third of their collections 
unprocessed. [footnote] “Unprocessed” and “uncataloged” mean that no
online catalog entries exist, nor are there in-house catalogs, indexes, or 
finding aids. 

Mark A. Greene and Dennis Meissner, “More Product, Less Process: 
Revamping Traditional Archival Processing,” American Archivist 68 
(Fall/Winter 2005): 208-63. 


Users of these massive aggregations of text, image, video, sound, and 
metadata will want tools that support and enable discovery, visualization, 
and analysis of patterns; tools that facilitate collaboration; an infrastructure 
for authorship that supports remixing, recontextualization, and commentary 
—in sum, tools that turn access into insight and interpretation. Examples 
might include humanities text-mining (discussed more specifically below), 
as in the Nora project, [footnote] or works of seemingly more traditional
scholarship that rely on digital tools, such as Ed Ayers’s book In the 
Presence of Mine Enemies (Norton, 2003), which unfolds a tale of the daily 
life of ordinary people during the Civil War that could not have been 
researched and developed without access to the gigabytes of digitized 
historical sources that constitute the Valley of the Shadow project. 
[footnote] 


University of Virginia http://valley.vcdh.virginia.edu/. 


If the promise of cyberinfrastructure is to be realized, humanists and social 
scientists must take the lead in directing the design and development of the 
tools their disciplines will use. We will require support systems for that 
development: research centers that are national repositories of expertise, 
postdoctoral programs that emphasize digital scholarship, and graduate 
programs that train the rising generation in the methods of digital research 
and scholarship. 


What will those tools, customized for the humanities and social sciences, 
do? A general answer to that question was offered to the Commission in its 
first public hearing by Michael Jensen, electronic publisher for the National 
Academies Press: “Human interpretation is the heart of the humanities. . . .
devising computer-assisted ways for humans to interpret more effectively
vast arrays of the human enterprise is the major challenge.” In practice, this 
means that tools for use with digital libraries will need to enable the user to 
find patterns of significance (heuristics) in very large collections of 
information, across many different types of data, and then interpret those 
patterns (hermeneutics). In the humanities and social sciences, heuristics 
and hermeneutics are core activities. 


In the world at large, the activity of discovering and interpreting patterns in 
large collections of digital information is called data-mining (or sometimes, 
when it is confined to text, text-mining), but data-mining is only one 
investigative method, or class of methods, that might become more useful 
in the humanities and the social sciences as we bring greater computing 
power to bear on larger and larger collections and more complex research 
questions, often with outcomes in areas other than that for which the data 
was originally collected. Beyond data-mining, there are many other ways of 
animating and exploring the integrated cultural record. They include 
simulations that reverse-engineer historical events to understand what 
caused them and how things might have turned out differently; game-play 
that allows us to tinker with the creation and reception of works of art; 
[footnote] role-playing in social situations with autonomous agents, or using
virtual worlds to understand behavior in the real world. [footnote] 
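
To make the idea of text-mining a little more concrete, the sketch below shows one very simple form of pattern-finding: ranking the words that are most distinctive of each document in a small collection. It is an illustration only, not a description of any project named in this report. It uses TF-IDF, a common weighting in text analysis, and the function names and sample texts are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def distinctive_terms(documents, top_n=3):
    """Rank each document's words by TF-IDF.

    `documents` maps a (hypothetical) document name to its text.
    Words that occur in every document score zero.
    """
    counts = {name: Counter(tokenize(text)) for name, text in documents.items()}

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())

    total_docs = len(documents)
    ranked = {}
    for name, term_counts in counts.items():
        scores = {
            term: freq * math.log(total_docs / doc_freq[term])
            for term, freq in term_counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ordered = sorted(scores, key=lambda t: (-scores[t], t))
        ranked[name] = ordered[:top_n]
    return ranked


# Invented sample texts, for illustration only.
sample = {
    "diary": "the war the war letters",
    "ledger": "the harvest the farm",
}
print(distinctive_terms(sample))
```

The ranking is only the heuristic half of the work described above; deciding what such patterns mean remains a matter of interpretation.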

Applied Research in Patacriticism, IVANHOE (2005)
http://www.patacriticism.org/ivanhoe/.


See, e.g., Joshua Epstein, Generative Social Science: Studies in Agent- 
Based Computational Modeling (Princeton: Princeton University Press, 
2006), and Edward Castronova, Synthetic Worlds: The Business and 
Culture of Online Games (Chicago: University of Chicago Press, 2005). 


We can design the software tools, computer networks, digital libraries, 
archives, and museums that are needed to assemble, preserve, and examine 
the human record in all of its “variety, complexity, incomprehensibility, and 
intractability,” as Henry Brady, Professor of Political Science and Public 
Policy and Director of The Survey Research Center at the University of 
California, Berkeley, described it during his August 2004 testimony to the 
Commission. [footnote] But many barriers stand between us and a future in
which we might realize something approaching the unification of the 
cultural record. Some of these barriers are technical, but the more 
formidable ones are human and societal—whether legal, organizational, 
disciplinary, political, or economic. Humanists and social scientists, being 
experts in human culture and social problems, should be well trained to 
address these challenges, but they will need to begin with their own 
organizations, disciplines, politics, and reward systems. The next chapter 
addresses these challenges. 




Ephemerality 


The Commission has identified six key challenges that must be engaged if 
we intend to build a robust cyberinfrastructure: 


• The ephemeral nature of digital data
• The nature of humanities and social science data
• Copyright laws
• The conservative culture of scholarship
• Uncertainty about the future mechanisms, forms, and economics of
scholarly publishing and scholarly communication more generally
• Insufficient resources, will, and leadership to build cyberinfrastructure
for the humanities and social sciences


Ephemerality 


The study of human cultures and creativity is founded on access to the 
records of the past. Preserving and ensuring the authenticity of the artifacts 
and records of the past is one of the most valued functions of libraries, 
archives, and museums—and yet we have only begun to learn how to do 
these things with the political, economic, social, and cultural record of our 
increasingly digital civilization. [footnote] Digital data are notoriously
fragile, short-lived, and easy to manipulate without leaving obvious 
evidence of fraud. Therefore, such content is best preserved in trustworthy 
repositories, without which there will be critical breaks in the chain of 
evidence. Although sites such as YouTube, Flickr, Facebook, and MySpace 
[footnote] have become popular for hosting digital collections, they are not
repositories that ensure long-term access to the content. The rapid turnover 
in digital hardware and software often leaves digital data marooned on 
media or in formats that can no longer be accessed and that are highly 
susceptible to deterioration and loss. Preservation requires the scrupulous 
management of data from the moment it enters a repository through the 
steps of validation, storage, migration, and delivery to parties that have 
been authenticated and authorized to receive it. These are complex technical 
procedures dependent on standards and protocols that work quickly and 
reliably. Preservation was once an obscure backroom operation of interest 
chiefly to conservators and archivists: it is now widely recognized as one of
the most important elements of a functional and enduring
cyberinfrastructure.
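
One small part of the validation step can be sketched in code. The example below is a minimal illustration rather than a description of any particular repository's practice. It assumes that a repository records a SHA-256 checksum for each file at ingest and recomputes it later to detect silent change or loss. The function names and the manifest layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest(paths):
    """Record a checksum for each file at the moment of ingest.

    Returns a hypothetical manifest mapping path -> recorded digest.
    """
    return {str(p): sha256_of(p) for p in paths}


def audit(manifest):
    """Return the paths that are missing or no longer match their recorded checksum."""
    failed = []
    for name, recorded in manifest.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != recorded:
            failed.append(name)
    return failed
```

A check of this kind can show that a file has changed, but not who changed it or why. That is one reason the surrounding management of storage, migration, and authorized delivery still matters.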

For an overview of some of the preservation issues and literature, see 
Daniel J. Cohen and Roy Rosenzweig, “Preserving Digital History,” in 
Digital History: A Guide to Gathering, Preserving, and Presenting the Past 
on the Web (Philadelphia: University of Pennsylvania Press, 2005) 
http://chnm.gmu.edu/digitalhistory/preserving/. 

YouTube http://www.youtube.com/; Flickr http://www.flickr.com/;
Facebook http://www.facebook.com/; MySpace http://www.myspace.com/. 


The Nature of Humanities and Social Science Data 


Digitizing the products of human culture and society poses intrinsic 
problems of complexity and scale. The complexity of the record of human 
cultures—a record that is multilingual, historically specific, geographically 
dispersed, and often highly ambiguous in meaning—makes digitization 
difficult and expensive. Moreover, a critical mass of information is often 
necessary for understanding both the context and the specifics of an artifact 
or event, and this may include large collections of multimedia content: 
images, text, moving images, audio. Humanities scholars are often 
concerned with how meaning is created, communicated, manipulated, and 
perceived. Recent trends in scholarship have broadened the sense of what 
falls within a given academic discipline: for example, scholars who in the 
past might have worked only with texts now turn to architecture and urban 
planning, art, music, video games, film and television, fashion illustrations, 
billboards, dance videos, graffiti, and blogs. 


The archive of the University of Southern California’s USC Shoah 
Foundation Institute for Visual History and Education [footnote] is a good
example of the value of critical mass or functional completeness. The tale 
of what happened to one or two families, in one or two villages, in one or 
two countries, during the Holocaust is worth recording and disseminating. 
But we can gain far more knowledge from the record of some fifty-two 
thousand testimonies. In history, art history, classics, or any other scholarly 
enterprise that benefits from a comprehensive comparative approach, 
quantity can become quality. 

http://www.usc.edu/schools/college/vhi/. 


The problems of digitizing cultural documents are multiplied when these 
documents have many audiences. Within the social sciences and 
humanities, there can be numerous subject specialists who want access to 
the same sources for different reasons. For example, the Roman de la Rose 
Project, a stunning digital collection of the major illuminated manuscripts 
of the Roman de la Rose, a popular medieval literary work, [footnote] is
used by literary scholars, art historians, linguists, social historians, and
preservation specialists, each of whom has a different disciplinary
perspective and vocabulary. Students and the general public often use such
documents as well, and since those audiences want further
contextualization, the data or evidence needs to carry within itself
more self-description and more cues about the context in which it belongs.

Johns Hopkins University and the Pierpont Morgan Library, Roman de la
Rose http://rose.mse.jhu.edu/.


Copyright 


The framers of the U.S. Constitution sought to balance the rights of the 
creators of intellectual property and the claims of the larger community. 
Article 1, Section 8, grants Congress the power to give “authors and 
inventors the exclusive right to their respective writings and discoveries,” 
but it also specifies that such rights be granted only “for limited terms” and 
with the purpose of promoting “the progress of science and the useful arts.” 
Today, because of the scale of investment that is required in order to create 
a unified cultural record online, the participation of commercial entities is 
essential, and yet many people (including most of those from whom the 
Commission heard) believe that the balance has been upset and that the 
property claims of rights holders are interfering with the promotion of 
intellectual and educational progress. 


The most notable recent U.S. Supreme Court decision on copyright— 
Eldred v. Ashcroft (2003)—involved someone who was seeking to 
disseminate works in the humanities to a broad public. Eric Eldred was the 
organizer of the Eldritch Press Web site, [footnote] dedicated to providing,
for free, works by nineteenth-century authors such as Nathaniel Hawthorne.
Eldred had wanted to add to his Web site Robert Frost's poetry collection
New Hampshire, which was slated to pass into the public domain in 1998,
[footnote] but the Sonny Bono Copyright Term Extension Act of 1998
(CTEA) halted his plans. Eldred sued to overturn CTEA on the grounds that 
its twenty-year extension subverted the constitutional provision of “limited” 
copyright terms and did nothing to promote new creativity. Eldred’s case 
was heard and his argument was rejected by the Supreme Court. 
Unrestricted access to our cultural heritage in digital form currently ends in 
1923: all of Hawthorne is up on the Web, but most of F. Scott Fitzgerald is 
not. Copyright restrictions will limit the Library of Congress’s planned 
World Digital Library: because the project intends to digitize only material 
in the public domain, it will have to exclude the great majority of cultural 
works of the twentieth century. 

http://www.ibiblio.org/eldritch/.


See http://www.legalaffairs.org/issues/March-April-2004/story_lessig_marapr04.msp.


Obtaining permission to digitize books, even if they are out of print, entails 
high transaction costs: it can be difficult, if not impossible, to locate the 
current owners of copyrighted works. In a study assessing the feasibility of 
obtaining permission from 209 publishers to digitize 277 titles published 
between 1920 and 2000, librarians at Carnegie Mellon University found 
that a quarter of the publishers could not be located, only half of the 
publishers responded after repeated efforts to contact them, and, in the end, 
permission was granted for only 25% of the titles. [footnote] 

Denise Troll Covey, Acquiring Copyright Permission To Digitize and 
Provide Open Access to Books, October 2005, Digital Library Federation 
and Council on Library and Information Resources. Persistent URL 


It is equally frustrating that many lesser-known creative and cultural works 
—not just books, but also photographs, drawings, films, and other materials 
—from the 1920s and later years cannot be made available online simply 
because the rights holders are difficult or impossible to find. Because recent 
copyright law has eliminated the requirement that rights holders formally 
apply for renewal, the copyrights of these so-called orphan works are 
automatically extended. Although such works often lack commercial value, 
the expense and difficulty of locating the rights holders block their
digitization. Most institutions want to avoid the risk of litigation should 
rights holders surface after the works have been made broadly accessible. In 
January 2006 the U.S. Copyright Office issued a report [footnote] on orphan
works; hearings were held in the House and the Senate, and, as of this 
writing, it seems likely that legislation will be introduced to remedy this 
situation. 

U.S. Copyright Office http://www.copyright.gov/orphan/.


Even more complex issues arise in providing access to unpublished works 
(manuscripts and letters, for example), a category of particular importance 
to the humanities. Many sound recordings, too, are effectively “protected” 
from being reproduced in the practice of scholarship until the latter half of
the twenty-first century, when any scholar now engaged in research is likely
to be dead. [footnote]

Most sound recordings issued before 1972 are protected until 2067. Before 
1972, sound recordings were protected by varying state laws rather than by 
federal law. The 1976 Copyright Act exempted recorded sound from federal 
protection until 2047; this date was changed to 2067 with the passage of the 
1998 Sonny Bono Copyright Term Extension Act. The implications of these 
protections for preservation are explored in a recent report by June M. 
Besek, Copyright Issues Relevant to Digital Preservation and Dissemination 
of Pre-1972 Commercial Sound Recordings by Libraries and Archives, 
December 2005, Council on Library and Information Resources and 


Current copyright laws not only keep most twentieth-century works from 
becoming available in digital form but also threaten the preservation of 
born-digital works. Although the copyright code currently has several 
important provisions that enable libraries and archives to make copies for 
preservation, these provisions are threatened by the transition to digital 
distribution. Section 108 of the copyright code is one such provision. It 
allows libraries and archives to duplicate works under copyright (in 
quantities specified by case law) to preserve their intellectual content. This 
provision covers the right of libraries and archives to copy works from one 
medium to another, such as brittle paper to microfilm or nitrate film to 
safety stock, and permits copying to digital form for preservation purposes 
(not for access). Yet it is not clear that all the forms of copying needed for 
secure digital archiving are allowable under the law. 


The provisions of Section 108, created for the world of print, need to be 
recast for the age of digital replication. As the 1998 Digital Millennium 
Copyright Act (DMCA) demonstrates, when recasting copyright law, it is 
important to consider unintended consequences. The DMCA lacks all of the
fair use provisions outlined in Section 107 of the Copyright Act
[footnote] and criminalizes all efforts to circumvent devices that prevent
duplication of digital materials, including efforts made to copy electronic
materials for preservation. Without an exception for such copying, the
preservation of published electronic materials is seriously jeopardized, and
the problem is bound to escalate as more and more content is distributed
digitally. The DMCA has also eroded the ability of public libraries, and,
indeed, of any library that is not exceptionally well funded, to serve its
patrons in a digital age, while putting at risk many digital projects such as
those described earlier. In other words, we could become much worse off than
we have been, historically, simply because existing law thwarts a reliable
and cost-effective means to preserve cultural content as a public service.
[footnote]

Section 107 lists the purposes for which the reproduction of a particular
work may be considered “fair,” such as criticism, comment, news reporting, 
teaching, scholarship, and research. For a discussion of fair use, see 
Marjorie Heins and Tricia Beckles, Will Fair Use Survive? Free Expression 
in the Age of Copyright Control, 2005, Brennan Center for Justice at the 
New York University School of Law 


For a concrete example of the effects that legal issues have on archiving 
efforts, see Jeff Ubois, “New Approaches to Television Archiving,” First 
Monday 10.3 (March 2005) 
http://firstmonday.org/issues/issue10_3/ubois/index.html.


The Conservative Culture of Scholarship 


In response to the Commission’s invitation for public comment on the draft 
of this report, Dickie Selfe (director of Michigan Technological University’s 
Center for Computer-Assisted Language Instruction) observed that the 
“challenge of cyberinfrastructure is primarily a challenge to our own 
academic cultures. This report is an opportunity to admit to that challenge 
and to commit to cultural change.” The university is an ancient institution, 
so it is not surprising that its culture is conservative, especially in the 
humanities—one of the oldest faculties of the university. Robert Darnton, a 
prominent scholar of French history, remarked at the Commission hearings 
that the structural elements of the academy have not changed, even though 
the world has. A recent study of the state of online American literary 
scholarship identified several cultural features among humanists that seem 
to militate against change. [footnote] Despite the demonstrated value of
collaboration in the sciences, there are relatively few formal digital 
communities and relatively few institutional platforms for online 
collaboration in the humanities. In these disciplines, single-author work 
continues to dominate. Lone scholars, the report remarked, are working in 
relative isolation, building their own content and tools, struggling with their 
own intellectual property issues, and creating their own archiving solutions.

Martha Brogan, A Kaleidoscope of Digital American Literature
(Washington, DC: Digital Library Federation and Council on Library and 
Information Resources, 2005). 


Many have contrasted this pattern to that found among technology-intensive 
sciences and engineering, in which “large, multidisciplinary teams of 
researchers” work “in experimental development of large-scale, engineered 
systems. The problems they address cannot be done on a small scale, for it 
is scale and heterogeneity that makes them both useful and interesting.” 
[footnote] In contrast to this collaborative model, Stephen Brier, Vice
President for Information Technology and External Programs of the City 
University of New York, told the Commission, “Humanists tend to be more 
focused on individual theorizing and communicating of ideas and 
information about their disciplines. Technology is not seen as a necessary,
let alone a required, tool for collaboration in the humanities the way it is in
the sciences.”

(Chatham, 11)


Most people the Commission interviewed expressed hope that an 
investment in cyberinfrastructure would allow humanists and social 
scientists to “conduct new types of research in new ways.” To take 
advantage of the technology, one must engage directly with it, and one must 
allow traditions of practice to be flexibly influenced by it. One such 
tradition in the humanities is that of the “individual genius.” Nevertheless, 
many of the examples cited in this report show us that humanists can be 
highly collaborative and that by working in groups, they can sometimes 
address research questions of greater scope, scale, and complexity than any 
individual—even a brilliant one—could address in isolation. 


Culture, Value, and Communication 


The European Commission’s Web site Knowledge Society [footnote] posits
that: 


"Our society is now defined as the “Information Society”, a society in 
which low-cost information and ICT [Information and Communication 
Technology] are in general use, or as the “Knowledge (-based) Society”, to 
stress the fact that the most valuable asset is investment in intangible, 
human and social capital and that the key factors are knowledge and 
creativity. This new society presents great opportunities: it can mean new 
employment possibilities, more fulfilling jobs, new tools for education and 
training, easier access to public services, increased inclusion of 
disadvantaged people or regions." 


One of the strategic goals set for Europe by the European Council is “to 
become the most competitive and dynamic knowledge-based economy in 
the world” by 2010. Clearly, other developed nations understand that 
economic growth is a function of knowledge and creativity, and that 
information is increasingly the core asset held by companies, the key social 
good produced by governments, and the determining factor in individual 
quality of life. 


A key component of the knowledge society is education, and education 
requires preservation of the record of the past as well as ongoing 
scholarship and research. Education, scholarship, and research all require 
the sharing of data and the communication of results. The system of 
scholarly communication includes scholars, publishers, libraries, and 
readers. Readers receive work that is produced by scholars using resources 
made available by publishers and held in or found through libraries. 
Scholars create value by doing research, thinking, and writing. Publishers 
add value through peer review, editing, and design. Libraries add value by 
collecting, organizing, and preserving scholarship, and, of course, by 
making it accessible. At least three economies are at work in this system: 


1. A prestige economy, primary for scholars and important but secondary 
for the other players 

2. A market economy, primary for publishers, usually not very important 
to scholars, and important but not primary for libraries 

3. A subsidy economy, primary for libraries, which are subsidized by 
universities, less available to publishers than it used to be, and more 
important to scholars than they generally know 


It should be no surprise that a system that comprises three different 
economies is difficult to operate successfully. When it does work, it has a 
certain elegance: each party contributes from its own sense of mission, and 
each gets paid in its own currency. The system has not always worked this 
way, though, and it may not continue to work this way much longer: at 
present, there seems to be general agreement that the system is broken, or 
breaking. [footnote] 

For an in-depth look at the pressures faced in one part of the system, by 
scholarly publishers, see John B. Thompson, Books in the Digital Age 
(Cambridge: Polity Press, 2005). Concerning the pressures faced by 
scholars, the Modern Language Association (MLA) has appointed a Task 
Force on Evaluation of Scholarship for Tenure and Promotion, which will 
complete its work this year and is expected to address how the tensions 
within the scholarly communication system are affecting junior faculty: see 
http://www.insidehighered.com/news/2005/12/30/tenure for summary
information. For a library perspective, see the series of reports collected
under the heading “Managing Economic Challenges” at the Council on 
Library and Information Resources 
http://www.clir.org/pubs/reports/managing.html, or OCLC Online
Computer Library Center, Environmental Scan: Pattern Recognition (2003) 
http://www.oclc.org/reports/escan/. 


Scholarship cannot exist without a system of scholarly communication: the 
cost of that system is a necessary cost of doing academic business. One 
could say that every part of this system is subsidized—from faculty to 
presses to libraries—and one could equally well say that every part operates 
under significant financial constraints. In the case of university-based 
publishers, institutional subsidy has declined in recent years, forcing 
university presses to behave more like commercial entities. [footnote] If,
however, we take a longer view of the information life cycle in universities,
revenue from sales may not be the best measure of the value of scholarship. 
It may make more sense to conceive of scholarly communication as a 
public good than as a marketable commodity. 

According to Peter Givler’s “University Press Publishing in the United
States” (in Scholarly Publishing, ed. Richard E. Abel, Lyman W. Newlin, and
Katina Strauch [New York: Wiley, 2002]), from 1988 to 1998, the average parent
institution support among reporting presses declined from 10.4 percent of 
net sales to 6.3 percent, for a loss of 4.1 percent; during the same period, 
outside gifts and grants increased, as a percentage of net sales, by only 1.6 
percent, for a net loss in non-publishing income of 2.5 percent. 


The phrase “public good” often refers to the idea that there are good things 
—things of special social value—that ought to be produced for free public 
use rather than as a marketable commodity. [footnote] Common examples of
public goods are national defense, vaccination programs, the GPS 
navigation system, dams, and public art. Education is often spoken of in 
these terms, and although education is to some extent exclusive (or there 
would not be systems of limited admissions), knowledge itself—as 
represented in scholarship and research—is not. Thomas Jefferson put it 
most eloquently: “He who receives an idea from me, receives instruction 
himself without lessening mine; as he who lights his taper at mine, receives 
light without darkening me.” [footnote] Private goods are a clear contrast to
this: if one person eats an apple, a second person cannot eat the same apple; 
but one person can teach another how to spell apple without thereby losing 
that knowledge. In the case of public goods, charging a price invariably 
reduces social welfare relative to what is possible. 

There is also an economic construct—not unrelated, but not the same— 
called a “pure public good.” This more abstract concept derives from the 
production and use of a good, and it is worth noting that pure public goods 
(for example, air pollution) may not always be good things. The defining 
characteristic of a pure public good is that one can add more consumers 
without diminishing the quantity of the good available to others. National 
defense, the system of contract law (as distinct from litigation itself), 
standards, and information are all examples of pure public goods. If, for the
pure public good, the cost of adding another consumer approaches zero,
then it follows as a matter of economic efficiency that the market price
ought to be zero, because to charge something for an item that costs nothing 
to produce at the margin is to pass up possible value—the value of making 
someone better off while doing no harm. 

Thomas Jefferson, “To Isaac McPherson,” 13 Aug. 1813, in Writings of 
Thomas Jefferson, ed. H. A. Washington, vol. 6 (Washington, DC: Taylor & 
Maury, 1853-1854) 180-81. 


On the other hand, although public goods can be extended to more users at 
or near zero cost, they can be quite costly to produce in the first place. The 
case of digitally produced scholarship is an excellent example. Economic
theory tells us that we ought to charge nothing for it at the margin: we
ought to give it away. It tells us nothing, however, about how to
pay for its production or how much of it to produce. It does tell us that
markets will underproduce this kind of good, though, and it also tells us 
that, as a general matter, the solution of public-goods problems requires 
collective action. 


Collectively, then, we should act to support the system of scholarly 
communication as a public good—and this collective action must be as 
broad as possible, including not only those universities with presses, but 
also all universities with faculty, libraries, students, and public outreach. 
After all, the social value produced by the system as a whole is enjoyed by 
all of these constituents. 


In considering how best to organize the publishing side of scholarly 
communication, it will also be important to be open to new business 
models. Received opinion and settled assumptions may be very costly, both 
in terms of missed opportunities and in terms of unforeseen expenses. For 
example, defying conventional wisdom, the National Academy Press has 
for some time now been distributing the content of its monographs free on 
the Web, and (thanks in part to a carefully thought-out strategy for doing 
that) it has seen its sales of print increase dramatically. 


By comparison with print, born-digital scholarship will be expensive for 
publishers to create and, over time, even more expensive for libraries to 
maintain. Even considering these costs, however, owning and maintaining 
digital collections locally or consortially, rather than renting access to them
from commercial publishers, is likely to be a cost-cutting strategy in the
long run. If universities do not own the content they produce—if they do 
not collect it, hold it, and preserve it—then commercial interests will 
certainly step in to do the job, and they will do it on the basis of market 
demand rather than as a public good. If universities do collect, preserve, and 
provide open access to the content they produce, and if everyone in the 
system of scholarly communication understands that the goods being 
produced and shared are in fact public goods and not private property, the 
remaining challenge will be to determine how much, and what, to produce. 


Such questions would normally be answered with reference to demand, and, 
indeed, one analysis of the “crisis in scholarly publishing” is that it is a 
crisis of audience. Average university-press print runs are now in the low 
hundreds, and although digital printing lowers the unit cost for printing 
short runs of books, selling fewer books raises the cost per copy to the 
library or scholar and makes it harder for the publisher to cover pre-press 
costs, which are still the most significant portion of the total cost of 
producing a book or article. On the other hand, university presses could 
(and should) expand the audience for humanities scholarship by making it 
more readily available online. Unless this public good can easily be found 
by the public—by readers outside the university—demand is certain to be
underestimated and the good itself undersupplied.


We note that some university presses have already made great strides in 
electronic publishing—Johns Hopkins’s Project MUSE, [footnote] Illinois’s
History Cooperative, [footnote] and the University of Virginia Press’s
Rotunda [footnote] series, to name a few. The Rice University Press, closed
in 1996, is being brought “back to life as the first fully digital university
press in the United States.” [footnote] Some scholarly societies, such as the
American Historical Association, also have experimented with publishing 
born-digital scholarship. These and other experiments in electronic 
publishing in the humanities and social sciences, and experiments in 
building and maintaining digital collections in libraries and institutional 
repositories, need to be supported as they move toward sustainability, and 
they need to be funded (by universities, by private foundations, and by the 
public) with the expectation that they will move toward open access—an 
area in which many of the natural sciences and some social sciences are
conspicuously ahead of the humanities. [footnote] Open-source software is
an instructive analogue here, and the experience in that community
suggests, strongly, that one can build scalable and successful economic
enterprises on the basis of free intellectual property. [footnote] It is worth
noting, too, that the “Economy of Regard” (that is, prestige) is one of the
factors used to explain why this open economy works. [footnote]

http://muse.jhu.edu/.


http://rotunda.upress.virginia.edu/. 

Rice University Press http://ricepress.rice.edu/. 

See John Willinsky, The Access Principle (Cambridge: Massachusetts 
Institute of Technology Press, 2005). 

See Bruce Perens, “The Emerging Economic Paradigm of Open Source” 
http://perens.com/Articles/Economic.html (2005).

See Paul A. David and Rishab Aiyer Ghosh, “Free and Open Source 
Software Developers and ‘the Economy of Regard’: A Quantitative 
Analysis of Code-Signing Patterns within the Linux Kernel,” Stanford
Institute for Economic Policy Research, SIEPR-Project NOSTRA Working 
Paper, 2004 


pen%20Source%20Software.html. 


As in the open-source community, [footnote] however, there are real
resources in play, and those who contribute to them must have some 
motivation to do so. According to Kate Wittenberg, director of Electronic 
Publishing in Columbia (EPIC), such enterprises must “find a way in which 
the technical infrastructure and some aspects of workflow systems might be 
created centrally and then shared by a variety of projects in the humanities 
and social sciences.” She adds, “For EPIC and similar organizations, 
finding an answer to this challenge would be extremely valuable: [it would 
make] use of existing infrastructure to create efficiencies in organizations 
with minimal staffing.” [footnote] One model of shared infrastructure
outside the United States is Erudit, an initiative of Les Presses de 
l’Université de Montréal. Erudit offers a range of services tailored to 
different kinds of academic publications and “is intended to serve as an 
innovative means of promoting and disseminating the results of university 
research.” [footnote] Another model might be a scaled-up version of EPIC
itself, which is a collaboration among Columbia University’s press,
libraries, and academic information systems. [footnote] The cooperation
between the University of California Press and the California Digital 
Library is another promising example. 

See Jill Coffin, “An Analysis of Open Source Principles in Diverse 
Collaborative Communities,” First Monday 11.6 (June 2006) 
http://www.firstmonday.org/issues/issue11_6/coffin/index.html.


tenberg_ summary. 
Erudit http://www.erudit.org/en/index.html. 
EPIC http://www.epic.columbia.edu/. 


Resources 




By any standard, investment in an American cyberinfrastructure is meager, 
as is U.S. research funding in general. [footnote] In 2003 the Atkins report
recommended annual expenditures of $1 billion to create a 
cyberinfrastructure for science and engineering; in 2005 funding 
specifically designated to shared cyberinfrastructure at the National Science 
Foundation (NSF) was about $123 million. On a per capita basis, Australia, 
Canada, and the United Kingdom and other European countries have made 
proportionally much greater investments in developing a broadly accessible 
cyberinfrastructure than has the United States. The countries of the 
European Union arguably are far ahead of the United States, especially in 
the humanities and social sciences areas, given their recent investments in 
digital cultural heritage. [footnote]

According to Vinton Cerf and Harris N. Miller in the Wall Street Journal
(27 July 2005), “our total national spending on R&D is 2.7% of our GDP,
and now ranks only sixth in the world. The federal government's share of
total national R&D spending has fallen from 66% in 1964 to 25%” in 2005.

See, e.g., these recent publications, which describe serious investment in
humanities and social sciences cyberinfrastructure in the United Kingdom
and the European Union:

British Academy, E-resources for Research in the Humanities and Social
Sciences—A British Academy Policy Review (2005)
http://www.britac.ac.uk/reports/eresources/ (20 May 2005).

British Academy, Future Directions for Social Science: A Response from the
British Academy (2004)
http://www.britac.ac.uk/news/reports/esrc-0904/esrc0904-html.htm (20 May 2005).

Guntram Geser and John Pereira, eds. (2004a). Resource Discovery
Technologies for the Heritage Sector (Vol. 6): European Commission.

Guntram Geser and John Pereira, eds. (2004b). Virtual Communities and
Collaboration in the Heritage Sector (Vol. 5): European Commission.

J. M. Jose (2004). Personalization techniques in information retrieval.
Resource Discovery Technologies for the Heritage Sector, ed. Guntram Geser
and John Pereira, European Commission. DigiCULT Thematic Issue 6.

S. Ross, M. Donnelly, and M. Dobreva (2004). Emerging Technologies for the
Cultural and Scientific Heritage Sector (Vol. 2): European Commission.

S. Ross, M. Donnelly, M. Dobreva, D. Abbott, A. McHugh, and A. Rusbridge
(2005). Core Technologies for the Cultural and Scientific Heritage Sector
(Vol. 3): European Commission.

British Academy, ‘That Full Complement of Riches’: The Contribution of the
Arts, Humanities, and Social Sciences to the Nation's Wealth (2004)


Aug. 2005). 


One example of the kind of resource we need to develop here in the United 
States is the UK Data Archive, a “centre of expertise in data acquisition, 
preservation, dissemination and promotion and . . . curator of the largest 
collection of digital data in the social sciences and humanities in the UK.” 
The Data Archive is funded by the Economic and Social Research Council 
(ESRC), the Joint Information Systems Committee (JISC) of the Higher 
Education Funding Councils, and the University of Essex. [footnote] 

UK Data Archive, http://www.data-archive.ac.uk/about/about.asp. 


In the United States, the only similar institution is the Inter-University 
Consortium for Political and Social Research (ICPSR), established in 1962. 
There is no direct equivalent of the Arts and Humanities Data Service 
(AHDS), mentioned in the UK Data Archive description and founded in 
1996 as a “UK national service aiding the discovery, creation and 
preservation of digital resources in and for research, teaching and learning 
in the arts and humanities.” [footnote] The AHDS is jointly funded by JISC
and the Arts and Humanities Research Council (AHRC), whose closest U.S. 
equivalent would be a combination of the National Endowment for the 
Humanities (NEH) and the National Endowment for the Arts (NEA). The 
AHRC has recently committed several years of new funding to the Methods 
Network to provide a “national forum for the exchange and dissemination 
of expertise in the use of Information and Communication Technologies 
(ICT) for arts and humanities research.” [footnote] 

Arts and Humanities Data Service http://www.ahds.ac.uk/.

Methods Network http://www.methodsnetwork.ac.uk/. 


The lack of a similar coordinated effort in the United States is troubling, 
and even in the national context, support for humanities and social science 
research is dwarfed by other governmental spending commitments. Health
research accounts for more than half of federal spending on basic
(nondefense) research: the National Institutes of Health’s budget request in 
fiscal year 2006 was about $28.5 billion. The National Science Foundation 
budget, which provides some funding for the social sciences and almost 
none for the humanities, was $5.6 billion. Of that amount, about 10%, or 
$509 million, went to the Directorate for Computer and Information 
Science and Engineering (CISE), which until recently had the primary 
responsibility for cyberinfrastructure. (The CISE budget also funds NSF’s 
portfolio of basic research in the computer and information sciences and 
related areas.) The NSF now has an Office of Cyberinfrastructure, which 
will guide the agency's investments in cyberinfrastructure for science and 
engineering, funded at $123 million. Federal funding for humanities-related 
projects is tiny by comparison. The fiscal-year 2006 budget requests of the
most important agencies, the National Endowment for the Humanities
($138 million) and the Institute of Museum and Library Services ($247
million), together total $385 million, less than the budget for CISE, which
is itself only about one-tenth of the NSF budget. And the ability of the NEA, NEH, and
IMLS to fund cyberinfrastructure directly is diminished because much of 
the money in these agency budgets goes to states through block grants over 
which the agencies have little control. 
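
The comparison above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the dollar figures quoted in the preceding paragraph (in millions), and the variable names are our own labels, not official budget categories.

```python
# Figures in millions of US dollars, as quoted in the paragraph above.
# Variable names are illustrative labels, not official budget line items.
NIH = 28_500                          # National Institutes of Health request
NSF_TOTAL = 5_600                     # National Science Foundation budget
CISE = 509                            # NSF directorate for computing research
OFFICE_OF_CYBERINFRASTRUCTURE = 123   # NSF Office of Cyberinfrastructure
NEH = 138                             # National Endowment for the Humanities
IMLS = 247                            # Institute of Museum and Library Services

# Combined request of the two humanities-related agencies.
humanities_combined = NEH + IMLS

# CISE as a fraction of the whole NSF budget.
cise_share_of_nsf = CISE / NSF_TOTAL

print(f"NEH + IMLS: ${humanities_combined} million")
print(f"CISE share of NSF budget: {cise_share_of_nsf:.1%}")
print(f"NEH + IMLS relative to CISE: {humanities_combined / CISE:.1%}")
print(f"NIH request relative to NEH + IMLS: {NIH / humanities_combined:.0f}x")
```

On these figures the two humanities-related requests come to $385 million, roughly three-quarters of the CISE budget, while CISE itself is a little over 9 percent of the NSF total.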


Private foundations are important sources of support in the humanities and 
the social sciences, but they cannot make up for the low level of federal 
funding. For example, no single private foundation in the United States— 
with the exception of the Bill & Melinda Gates Foundation, which 
primarily funds health initiatives—has annual funding that equals the 
budget of CISE. [footnote] Among the large private foundations, few are
focused on humanities and social sciences. Nevertheless, philanthropic 
sources have so far played a disproportionately large role in funding the 
experimentation in digital projects in the humanities. Foundations such as 
the Andrew W. Mellon Foundation, the Getty Trust, the Carnegie 
Corporation, and the William and Flora Hewlett, David and Lucile Packard, 
and Alfred P. Sloan foundations have made strategic investments in 
building resources or seeding projects. There have also been remarkable 
instances of individual philanthropy from committed individuals, such as 
Brewster Kahle (the Internet Archive [footnote]), Rick Prelinger (Archive 
Films [footnote]), and David Rumsey (the David Rumsey Map Collection
[footnote]), who not only collect high-value resources for the humanities
and social sciences but also make them freely available on the Web. These 
are the Carnegies of the digital age, building digital libraries just as Andrew 
Carnegie built physical ones. 

The Foundation Center, “Foundation Growth and Giving Estimates” (2005).

http://www.archive.org/.

http://www.archive.org/details/prelinger. 

http://www.davidrumsey.com/. 


New federal funding is urgently needed for cyberinfrastructure in the 
humanities and social sciences and also for research and demonstration 
projects that explore new, sustainable business models for digital 
humanities and social science. Received wisdom on the limits of the market 
for ideas has been radically reoriented by the rise of networked 
communities, and, at this point, scholarly communication may well stand to 
lose more by failing to experiment than from experiments that fail. 
Universities need to connect with commercial information-technology 
innovators in order to understand these new information markets, 
experiment with business models, and think creatively about the value that 
is produced by research and teaching in the humanities and social sciences. 
In fact, corporate supporters and partners have played an important, often 
foundational, role at campus-based technology and media laboratories such 
as the Entertainment Technology Center at Carnegie Mellon; the School of 
Literature, Communication, and Culture at Georgia Tech; the Massachusetts 
Institute of Technology Media Lab; the Entertainment Technology Center at 
the University of Southern California; and the Institute for Advanced 
Technology in the Humanities at the University of Virginia. Commercial 
partners in these ventures may understand better than their academic 
counterparts how to communicate value to those who will pay for it, and 
academic institutions may understand better than their commercial 
counterparts how to ensure that value is not only circulated in the present 
but handed down in the future. There is a public interest even in privately 
held cultural materials, so it is inevitable that some difficult issues will arise 
where public and private meet; yet the creation of a robust 
cyberinfrastructure will require vigorous collaboration across this boundary.
[footnote] If such bridges can be built and crossed, the resulting traffic will
be good for education, good for business, and good for civic life. 

See Peter B. Kaufman, “Marketing Culture in the Digital Age: A Report on
New Business Collaborations between Libraries, Museums, Archives and
Commercial Companies” (2005)
http://www.intelligenttelevision.com/marketingculture.htm.


Framework 




In the years following the Civil War, the land grant universities transformed 
American higher education. After World War II, the GI Bill further 
propelled that transformation from an elitist educational system to one open 
to the public. The GI Bill itself created no institutions, nor did it mandate 
institutional behavior; but this direct means of distributing opportunity and 
resources dramatically expanded the number of people who considered 
college a possibility and prompted colleges and universities to see 
themselves as national, rather than local or regional, institutions. 
Established institutions that were responsive to the new opportunities, such 
as the University of California, flourished. 


When the federal government began the direct support of advanced 
research, the National Science Foundation (NSF), the National Institutes of 
Health, and, later, the National Endowment for the Humanities and the 
National Endowment for the Arts adopted the extramural grant mechanisms 
pioneered by philanthropic foundations. They combined these mechanisms 
with the peer-review practices developed within universities to distribute 
research support on the basis of competitive applications. The competitive 
“market” for research support reinforced standards of scholarly excellence 
and relied on the research ambitions of individual scholars to motivate the 
institutional response of universities in developing their local research 
infrastructures. 


The response of American higher education to the GI Bill, and the process 
developed by the federal government to fund advanced research, 
demonstrate that frameworks for action can challenge institutions to build 
upon existing capacities. This report suggests that cyberinfrastructure is 
another such framework for guiding decisions, allocating resources, and 
setting directions. Thinking about structures naturally requires also thinking 
about functions and their schematic relationship. That the NSF has already 
adopted cyberinfrastructure as such a framework underlines the need for 
strategic thinking. The cyberinfrastructure of the humanities and social 
sciences does not and will not exist independently of the larger academic
infrastructure, where the sciences thus far have set priorities. Similarly,
academic stakeholders must take account of the even larger social and 
commercial cyberinfrastructure that is, increasingly, the platform on which 
human creativity and social interaction—the subjects of the humanities and
social sciences—are expressed and take place.


There follows a framework for action. First, we present five necessary 
characteristics of a robust cyberinfrastructure in the humanities and social 
sciences. Second, we identify eight actions that must be undertaken to make 
that infrastructure possible. 


Necessary Characteristics 




An effective and trustworthy cyberinfrastructure for the humanities and 
social sciences will have the following characteristics: 


1. It will be accessible as a public good. 


We have argued that digital information has an inherently democratizing 
power—but that power can be unleashed only if access to the cultural 
record is as open as possible, in both intellectual and economic terms, to the 
public. On the one hand, the Web has made a great deal of human 
knowledge available for free: with its nine million items, the Library of 
Congress’s American Memory program is but one example. On the other 
hand, commercial entities have taken an increasingly prominent role both in 
digitizing public-domain cultural heritage and in digitizing cultural heritage 
materials still under copyright; these collections are often only available to 
organizations (such as major research libraries) able to pay substantial 
subscription or license fees. If public funds are involved in the creation of a
digital resource, a proportional share of that resource should be freely
available to the public.


2. It will be sustainable. 


Sustainability is often thought of as primarily a financial issue: how will a 
project persist after start-up funding is spent? The digital transformation has 
raised questions about how to finance research, scholarly communication, 
and preservation that previously were obscured by the practices of libraries 
and university presses. Many humanists may have first encountered the 
concept of sustainability in discussions with potential funders of digital 
projects. As Diane M. Zorich noted in 2003, we need to avoid treating 
digital initiatives “as ‘special projects’ rather than as long-term programs.”
[footnote] Although funding is critical to a program’s viability, sustainability
goes beyond simply paying the bills: intellectual sustainability requires
human capital. Digital projects need to draw on a pool of trained and 
engaged personnel, and therefore universities need to develop the programs 
and the opportunities that produce people with this kind of expertise. As 
Kevin Guthrie, the first director of JSTOR and now president of Ithaka, 
[footnote] remarked to the Commission, “individual experience is not
scalable.” 

Diane M. Zorich, A Survey of Digital Cultural Heritage Initiatives and 
Their Sustainability Concerns (Washington, DC: Council on Library and 
Information Resources, 2003) 


http://www.ithaka.org/. 


3. It will provide interoperability. 


Access to data should be seamless across repositories. This will require 
standards-based tools and metadata that ensure interoperability and enable 
use for a variety of purposes. Cyberinfrastructure must be designed to be 
open, modular, and easily adaptable to new technologies so that the pursuit 
of interoperability does not become a source of delay and constraint. It must 
also be built to foster and support knowledge communities, which 
themselves must include information professionals who understand the 
standards issues. As NSF director Arden L. Bement, Jr., observes, “with
today’s electrical grid . . . my neighbor and I can use different appliances to
meet our individual needs; as long as the appliances conform to certain 
electrical standards, they will work reliably,” and a sufficiently advanced 
cyberinfrastructure will work similarly: researchers will have “easy access 
to the computing, communication, and information resources they need, 
while pursuing different avenues of interest using different tools.” 
[footnote] In sum, cyberinfrastructure must serve geneticists and
genealogists, historians of Buddhism and collectors of Delta blues, 
filmmakers and dancers, those in the academy, those working in business 
and industry, and those home-schooling their children. 

Arden L. Bement, Jr., “From Concept to Confluence: Framing Our
Cyberinfrastructure,” remarks, SBE/CISE Cyberinfrastructure Workshop 
(16 March 2005). 


4. It will facilitate collaboration.


Digital technology favors openness and collaboration. Defining and 
building cyberinfrastructure should be a collaborative undertaking 
involving the humanities and social sciences communities in the broadest 
sense. It is equally important that the cyberinfrastructure be designed to 
foster and support collaboration across disciplinary and geographical 
boundaries and to bring new perspectives to bear on the exploration of the 
cultural record. Collaboration will be especially important as institutions of 
higher education seek to preserve and archive digital materials. Digital 
preservation will require leveraging talent, resources, and commitment in 
the academy, in the commercial sector, and in government. Each sector has 
already made significant contributions, each has a leadership role to play, 
and each needs to be further involved in the curation of our cultural 
heritage. 


5. It will support experimentation. 


Although cyberinfrastructure itself should be stable and reliable, it will need 
to support ongoing experimentation, and it will need to evolve. Researchers 
in the social sciences and humanities will need to experiment, and that 
experimentation will be crucial to bringing change to those disciplines. 
Institutions must encourage risk-taking by creating frameworks through 
which junior scholars and students are rewarded for ambitious research 
programs. Offering this encouragement means providing laboratories, 
postdoctoral grants, and other support that allows these research programs 
to be worked out and critically assessed. Institutions also need to allow their 
libraries and university presses to experiment and take chances in order to 
find more successful models of scholarly communication. It is important to 
foster a culture of experimentation by underwriting explicit mechanisms 
and traditions for capturing and sharing the lessons learned through 
innovation. True experimentation always carries with it the possibility of 
failure, as the necessary price for success, yet informative failures are 
essential to moving forward into the unknown, and they should be reported 
without prejudice and duly valued on that account. [footnote] 


John Unsworth, “The Importance of Failure,” The Journal of Electronic 


0Q2/unsworth.html. 


Recommendations 




The necessary characteristics outlined above may be thought of as 
specifications for a humanities and social science cyberinfrastructure. 
Actually building something that answers to those specifications will 
require sustained effort and commitment in at least eight areas: 


1. Invest in cyberinfrastructure for the humanities and social sciences,
as a matter of strategic priority. 


Addressed to: Universities; federal and private funding agencies 


Implementation: Determine the amount and efficacy of funding that now 
goes to support developing cyberinfrastructure for humanities and social 
sciences from all sources; through annual meetings and ongoing 
consultation, coordinate the goals this funding aims to achieve; and aim to 
increase both funding and coordination over the next five years, including 
commercial investments that are articulated with the educational 
community’s agenda. 


Senior scholars, research librarians, university leaders, state and national 
legislators, and members of the public interested in the cultural record 
should regard the development of the humanities and social science 
cyberinfrastructure as an essential strategic priority. Other countries already 
recognize this to be so. In European countries and in Canada and Australia, 
humanities and social science cyberinfrastructure is more generously 
funded (relative to the size of the population) than in the United States, and 
research frameworks integrate the support of humanities and social sciences 
with the support of science and engineering. 


In 2005 the British Academy issued an academic policy review in which the 
leading recommendation was that “relevant UK institutions and bodies 
adopt a coordinated and coherent strategic approach to e-resource provision 
and access, based on research community needs.” [footnote] 


British Academy, E-resources for Research in the Humanities and Social 
Sciences—A British Academy Policy Review (2005) 
http://www.britac.ac.uk/reports/eresources/ (20 May 2005).


The German e-Science Initiative was announced by the German Ministry 
for Research and Education (BMBF) in March 2004, coupled with a call for 
proposals in the areas of grid computing, e-learning, and knowledge 
management. The e-Science Initiative and D-Grid were launched on 
September 1, 2005. Currently, BMBF is funding over a hundred German 
research organizations with €100 million [$124 million] over the next five 
years. For the first three-year phase of D-Grid, the support is almost €20 
million [$25 million]. One of seven projects currently funded under this 
initiative is TextGrid, described as a “community grid for text-based 
disciplines.” [footnote] 

See Federal Government of Germany, Federal Ministry of Education and 


In Australia, 542 million Australian dollars ($405 million) is targeted for 
the National Collaborative Research Infrastructure Strategy, a major 
initiative under the Australian government’s “Backing Australia’s Ability— 
Building Our Future through Science and Innovation” program. This 
program “aims to bring greater strategic direction and coordination to 
national research infrastructure investments” while providing researchers 
with “access to major research facilities and the supporting infrastructure 
and networks necessary to undertake world-class research.” [footnote] One 
of ten areas of emphasis in this program is “platforms for collaboration,” 
described in the strategic road map as aimed in part at the needs of the 
humanities and social sciences. [footnote] 
See Government of Australia, Department of Education, Science, and 
Training 

issues/ncris/default.htm. 
See Government of Australia, Department of Education, Science, and 
Training 


issues/ncris/documents/ncris_strategic_roadmap_pdf.htm. 


Investments in cyberinfrastructure are organized differently in each country, 
but from the point of view of this Commission, the salient fact is that they 
do include the humanities and social sciences. More importantly, the 
humanities and social sciences are a fully integrated part of the conversation 
and planning in these countries in a way that has not occurred in the United 
States. The United Kingdom, Germany, and Australia are only three of the 
nations gearing up strategic efforts in cyberinfrastructure with the 
humanities and social sciences in mind. The United States must make 
similar investments if we are to compete internationally—for students, 
corporate funding, and cultural impact. 


2. Develop public and institutional policies that foster openness and 
access. 


Addressed to: University presidents, boards of trustees, provosts, and 
counsels; university presses; funding agencies; libraries; scholarly societies; 
Congress 


Implementation: The Association of American Universities, in 
collaboration with other organizations such as the National Humanities 
Alliance, the Scholarly Publishing and Academic Resources Coalition, and the National 
Academy of Arts and Sciences, should take a leadership role in 
coordinating the engagement of the humanities and social sciences with 
issues of information policy. 


Open access is critical to constructing and deploying meaningful 
cyberinfrastructure, and it will be important for the humanities and social 
sciences to engage in active dialogue and then to lobby effectively 
concerning legislative and policy developments in this area—for example, 
in support of the Federal Research Public Access Act of 2006. The Open 
Content Alliance offers one good platform for the dialogue the Commission 
wishes to promote; it lists as its members a number of libraries and 
museums as well as commercial content providers, software companies, and 
search engine companies. We encourage scholarly societies and university 
presses—currently unrepresented—to join the Alliance. [footnote] 
http://www.opencontentalliance.org/index.html (30 April 2006). 


The Commission also strongly encourages the funders of research in the 
humanities and social sciences to require from applicants a plan for sharing 
and preserving data generated using grant funding, and we urge universities 
with commercial digitization partners to address long-term ownership and 
access issues when creating those partnerships. We also call on university 
counsels, boards of trustees, and provosts to provide aggressive support for 
the principles of fair use and open access, and to promote awareness and 
use of Creative Commons licenses. [footnote] We call on senior academic 
leaders to ensure that their own practices (as producers of intellectual 
property and as editors of journals) and the practices of university presses, 
libraries, and museums support fair use and open access. And, finally, the 
Commission calls on scholarly societies and universities to advocate that 
Congress redress imbalances in intellectual property law that currently 
prevent or inhibit preservation, discourage scholarship, and restrain 
research and creativity. 
http://www.creativecommons.org/licenses/by-nc-sa/2.5. 


Laws, policies, and conventions surrounding copyright and privacy are an 
implicit part of the cyberinfrastructure in the social sciences and 
humanities. We must align current law with the new realities of digital 
knowledge environments. Laws that support these knowledge environments 
must take into account the characteristics of digital content and the practices 
that make that content productive. The recent effort of the Copyright Office 
to address the problem of “orphan works”—works with uncertain copyright 
status, which therefore cannot be used with impunity by scholars and others 
—is a welcome example of a key agency in this debate taking an 
appropriate leadership role. [footnote] We urge Congress to pass legislation 
that adopts the statutory language recommended by the Register of 
Copyrights in her report. Another example of such leadership is the Library 
of Congress’s current study of Section 108 of the copyright code and its 
implications for preservation. 

To read the Copyright Office’s report, see 
http://www.copyright.gov/orphan/. For a general overview, see Scott 
Carlson, “Whose Work Is It, Anyway?” The Chronicle of Higher Education 
(29 July 2005) http://chronicle.com/free/v51/i47/47a03301.htm. 


The Commission can offer no simple solutions to complex issues of 
intellectual property. Scholars, after all, create as well as use intellectual 
property and so are on both sides of these contentious debates. But 
researchers have traditionally embraced openness and sharing, and that 
spirit should be encouraged and facilitated in the digital environment. They 
should not be intimidated by the efforts of rights holders to restrict valid 
educational uses of materials. Scholars should, for example, be encouraged 
to take full advantage of the “fair use” provisions of the copyright laws. 


While scholars advocate public and legal policies of openness and access, 
they similarly must advocate these policies within their own communities to 
the greatest extent practically and legally possible. The Massachusetts 
Institute of Technology’s Open CourseWare is an interesting and instructive 
example at the level of the core instructional activities of faculty: it freely 
distributes course materials. Universities need to consider the impact of 
their technology transfer and intellectual property policies; university 
presses and scholarly societies need to envision creative dissemination 
models that reflect academic values, and then lobby for the actual resources 
needed to realize those models; museums need to make their digitized 
surrogates freely available, as they already increasingly do. All parties 
should work energetically to ensure that scholarship and cultural heritage 
materials are accessible to all—from a student preparing a high-school 
project to a parent trying to understand the issues in a school-board debate 
to a tourist wanting to understand Rome’s art and architecture. 


3. Promote cooperation between the public and private sectors. 


Addressed to: Universities; federal and private funding agencies; 
Internet-oriented companies 


Implementation: A private foundation, a federal funding agency, an Internet 
business, and one or more university partners should cosponsor recurring 
annual summits to explore new models for commercial/nonprofit 
partnerships and to discuss opportunities for the focused creation of digital 
resources with high educational value and high public impact. 


Universities and those who fund them (privately or publicly) need to 
reallocate resources to support digital cultural activities and develop new 
financial models for making those activities sustainable. For-profit 
companies that work with digital cultural heritage materials or publish 
humanities and social-science research need to address long-term 
preservation and access issues. 


Nearly every discussion in the course of the Commission’s investigations 
emphasized the urgent need for new funding and new models of financial 
sustainability to fund certain core areas, such as preservation and curation 
of cultural materials, innovative research in the humanities and social 
sciences, electronic publication, and development of tools and resources for 
classroom use. Recent partnership agreements between research university 
libraries and Google represent one model of financial sustainability, 
although some question the long-term harmony of interests and missions in 
these partnerships. Even if such questions persist, continued 
experimentation with new forms of cooperation between the private sector 
and cultural institutions remains of utmost importance. Commercial and 
nonprofit partnerships are possible, and commercial investment has often 
benefited scholarship and the dissemination of cultural heritage content in 
North America. [footnote] Such partnerships can contribute a great deal to 
innovation as well as promote entrepreneurial engagement in challenges 
(such as digitization) that the cultural sector will be unable to address by 
itself. 

The American Antiquarian Society, for example, the leading repository of 
pre-1800 printed Americana, has enjoyed a business partnership with 
ReadEx-Newsbank for 50 years, a partnership that has resulted in the 
investment of millions of dollars in digitizing and disseminating the cultural 
record of early America. 


Still, there will always be scholarship, teaching, and research that can be 
conducted only with public subsidy, either directly from the government or 
from tax-exempt private philanthropy. Government funding agencies, most 
notably the National Endowment for the Humanities (NEH) and the 
Institute of Museum and Library Services (IMLS), should continue their 
support of digital projects, including digital tools and other elements of the 
cyberinfrastructure. We believe that increased support from the National 
Science Foundation (NSF) for work in the digital humanities will benefit 
both the humanities and computer science. The recent joint initiative of the 
NEH, NSF, and Smithsonian Institution to fund the documentation of 
endangered languages demonstrates that such a partnership can succeed. 
[footnote] Other areas of digital library development should be cosponsored 
with federal agencies such as the Library of Congress, IMLS, Smithsonian, 
National Archives and Records Administration, NSF, and National Institute 
of Standards and Technology. 

National Science Foundation 
http://www.nsf.gov/pubs/2004/nsf04605/nsf04605.htm. 


The Andrew W. Mellon Foundation is both a leader in and a leading 
funder of the application of digital technologies to the humanities and social 
sciences. The William and Flora Hewlett Foundation, the Packard Institute 
for the Humanities, the Rockefeller Foundation, and others have also 
provided support to critical initiatives. While many other private funding 
agencies have supported digital projects, these efforts have not so far been 
coordinated purposefully to achieve the kind of cyberinfrastructure 
envisioned in this report. 


4. Cultivate leadership in support of cyberinfrastructure from within 
the humanities and social sciences. 


Addressed to: Senior scholars; scholarly societies; university 
administrators; senior research librarians and research library organizations; 
academic publishing organizations; federal funding agencies; private 
foundations 


Implementation: Increase federal and foundation funding to one or more 
scholarly organizations in the area of humanities and social science 
computing so that they can work with member organizations of the 
American Council of Learned Societies (ACLS) and others to establish 
priorities for cyberinfrastructure development, raise awareness of research 
and partnership opportunities among scholars, and coordinate the evolution 
of research products from basic to applied. 


Librarians, rather than scholars, have provided much of the recent 
leadership within the academy on issues of cyberinfrastructure in the 
humanities and social sciences. Reflecting the conservative culture of 
scholarship, some scholars have questioned librarians’ investments in 
building digital collections and acquiring online resources. Given that the 
library constitutes the historic infrastructure of scholarship, it is entirely 
appropriate that librarians have sought to re-ignite scholarly engagement 
with infrastructural issues. Nevertheless, others now need to take up the 
cause and shoulder their leadership responsibilities. As the task force of the 
Association of American Universities indicated in its 2004 report 
Reinvigorating the Humanities, “[u]niversity presidents, provosts and 
humanities deans” must “support the development and use of digital 
information and technology in the humanities.” [footnote] 

Association of American Universities, Reinvigorating the Humanities: 
Enhancing Research and Education on Campus and Beyond (Washington, 
DC: Association of American Universities, 2004), TV 59-69 
http://www.aau.edu/issues/HumRpt.pdf. 


Leadership requires structure. Humanities organizations, in particular, 
should develop new means of sharing information and setting agendas. 
Again, the example of the library community is instructive. The Association 
of Research Libraries (ARL); Council on Library and Information 
Resources (CLIR); and Online Computer Library Center (OCLC), which is 
about to absorb the Research Libraries Group, have made technological 
transformation central to their missions and programming. They have, in 
turn, created vehicles—the Coalition for Networked Information, the 
Digital Library Federation, the Scholarly Publishing and Academic 
Resources Coalition—dedicated entirely to providing leadership on these 
issues. Very few cognate efforts exist in the humanities and social sciences. 
The Alliance of Digital Humanities Organizations (ADHO), H-Net, and the 
Humanities, Arts, Science, and Technology Advanced Collaboratory 
(HASTAC) are three examples, but these have not enjoyed the kind of 
financial support from the humanities and social sciences communities that 
ARL, CLIR, OCLC, or RLG have received from the research library 
community. Scholarly societies have a special role to play in providing— 
and funding—similar leadership for scholars in the humanities and social 
sciences. 


At the campus level, university administrators should go out of their way to 
ensure that representatives from the social sciences and humanities are at 
the planning table alongside librarians, scientists, and engineers when issues 
of cyberinfrastructure are being decided. All too often, humanists and social 
scientists learn about policy and funding decisions after they are made. By 
the same token, scholars in the humanities and social sciences must not 
hesitate to insist on being included in these discussions and decisions. 


5. Encourage digital scholarship. 


Addressed to: Universities; research libraries; the National Endowment for 
the Humanities (NEH); the National Endowment for the Arts (NEA); the 
Institute of Museum and Library Services (IMLS); the National Academies; 
the National Archives; major private foundations; major scholarly societies; 
individual leaders in the humanities and social sciences 


Implementation: Federal funding agencies and private foundations should 
establish programs that address workforce issues in digital humanities and 
social sciences, from short-term workshops to postdoctoral and research 
fellowships to the cultivation of appropriately trained computer 
professionals. The ACLS should lead its member organizations in 
developing uniform policies with respect to digital scholarship in tenure and 
promotion. 


The Commission believes that digital scholarship is the inevitable future of 
the humanities and social sciences, and that digital literacy is a matter of 
national competitiveness and a mission that needs to be embraced by 
universities, libraries, museums, and archives. In order to foster digital 
research, teaching, and publishing, we recommend specifically that there be 


• fellowship and research leave for digital scholarship and for 
collaborative research projects in laboratories that take full advantage 
of cyberinfrastructure; 

• policies for tenure and promotion that recognize and reward digital 
scholarship and scholarly communication; recognition should be given 
not only to scholarship that uses the humanities and social science 
cyberinfrastructure but also to scholarship that contributes to its 
design, construction, and growth; 

• workshops aimed at introducing scholars and teachers to the methods 
and possibilities of digital scholarship and giving them the opportunity 
to develop their own creative ideas in the context of 
cyberinfrastructure; [footnote] 

• workshops that bring scholars and technologists together around a set 
of goals and that forge working partnerships with computer scientists 
and engineers; 

• university support for software, data storage, and technical support for 
librarians and computer professionals. 


We might expect younger colleagues to use new technologies with greater 
fluency and ease, but with tenure at stake, they will also be more 
risk-averse. There is a widely shared perception that academic departments in 
the humanities and social sciences do not adequately reward innovative 
work in digital form. A handful of recent examples provide exceptions to 
the norm, but in the most elite universities, traditional scholarly work, in the 
form of a single-authored, printed book or article published by a university 
press or scholarly society, is the currency of tenure and promotion; work 
online or in new media—especially work involving collaboration—is not 
encouraged. Senior scholars now have both the opportunity and the 
responsibility to take certain risks, first among which is to condone risk 
taking in their junior colleagues and their graduate students, making sure 
that such endeavors are appropriately rewarded. 


How will younger scholars in the humanities and social sciences engage 
these new technologies and methods? Experience demonstrates that some 
will find a way of their own, but it also suggests that if more than a few are 
to pioneer new digital pathways, more formal venues and opportunities for 
training and encouragement are needed. The Commission recommends the 
creation of brief (one- to three-week) workshops for younger scholars— 
perhaps located at some of the existing centers in the digital humanities and 
social sciences and organized in conjunction with scholarly societies— 
focusing on how to do research, how to present the products of scholarship, 
and how to teach in the digital era. One model could be the Canadian Social 
Sciences and Humanities Research Council’s Image, Sound, Text and 
Technology Institute Program, which provides grants for such workshops. 
[footnote] A recent workshop on digital scholarship offered only to younger 
scholars in one very specific domain (the history of science and 
technology) found itself vastly oversubscribed. [footnote] But we should 
not neglect training opportunities for midcareer scholars who wish to learn 
about new tools, resources, and approaches. 

Canadian Social Sciences and Humanities Research Council 

The workshop, offered by the Center for History and New Media at George 
Mason University with funding from the Sloan Foundation, had 75 
applicants for 15 slots. 


It is also important to remember that students, and often their teachers, need 
help in making sense of what they find. For example, a 1930s photograph of 
sharecroppers, with the imprimatur of the Library of Congress’s American 
Memory site, may seem to be a transparent reflection of social and 
historical reality rather than a created and composed artifact with a larger 
political message. We recommend that resources be devoted to making 
students (and citizens) into sophisticated and critical consumers of the vast 
cultural heritage that has been placed at their fingertips. Some of this can be 
done electronically, but workshops for K-12 teachers who use the Web in 
their classrooms are badly needed as well. 


6. Establish national centers to support scholarship that contributes to 
and exploits cyberinfrastructure. 


Addressed to: Universities; Congress; state legislatures; public funding 
agencies; private foundations 


Implementation: Universities should develop national and international 
fellowships at existing humanities and social science computing centers, 
and develop new centers with such programs, with a combination of 
university, federal, and private funding. 


A robust cyberinfrastructure should include centers that support 
collaborative work with specialized methods. When human, institutional, or 
technical resources become too expensive to replicate at every institution, it 
makes sense to provide those resources through a more limited number of 
national centers. This is what has already been done in the sciences, and it 
is what should also be done in the humanities and social sciences. Public 
funds should be at the forefront of support to such national centers of 
excellence in digital humanities and social science, as crucial seedbeds of 
further innovation. 


The humanities and social science cyberinfrastructure should include a 
network of such centers distributed around the country. Centers might focus 
on particular methods or tools—for example, the application of Geographic 
Information Systems or data-mining or visualization to humanities and 
social science research problems. Centers might also, in some cases, be 
devoted to research involving copyrighted digital materials or research 
involving confidential social science data. The Inter-University Consortium 
for Political and Social Research (ICPSR) is one such national center in the 
social sciences; the Vanderbilt Television News Archive might be taken as 
an example or a starting point with respect to copyrighted material. The 
Library of Congress’s NDIIPP (National Digital Information Infrastructure 
and Preservation Program) partnerships are exploring the creation of data 
centers to serve other communities, using a range of business models. 


Universities should foster interdisciplinary laboratories and research groups 
that include both technical and subject expertise. “Once humanities faculty 
began using the laboratory in their research,” Stanford University computer 
scientist Marc Levoy told the Commission, “they would also find creative 
ways to fold its technology into their teaching—for example, through 
project-based assignments in upper-level courses. This would bring 
humanities students into the lab, some of whom have dual backgrounds, and 
so could help run the lab.” Provost James O’Donnell of Georgetown 
University, speaking to the Commission, advocated “zones of 
experimentation and innovation for humanists.” O’Donnell added that those 
zones should be “part and parcel of the formal academic structure. Ghettos 
are not the answer. We need instead the creation of privileged but open 
communities, where the very best young people are challenged to invent, 
experiment, break things, and succeed.” Exemplary models of such centers 
include the American Social History Project/Center for Media and Learning 
at the City University of New York; the Center for History and New Media 
at George Mason University; MATRIX, the Center for Humane Arts, Letters, 
and Social Sciences at Michigan State University; and the Institute for 
Advanced Technology in the Humanities at the University of Virginia. The 
National Center for Supercomputing Applications at the University of 
Illinois has recently shown interest in arts, humanities, and social sciences, 
and its involvement in this effort would be most welcome. [footnote] 

The American Social History Project/Center for Media and Learning 
http://www.ashp.cuny.edu/; Center for History and New Media 
http://chnm.gmu.edu/; Institute for Advanced Technology in the Humanities 
http://jefferson.village.virginia.edu/; National Center for Supercomputing 
Applications http://www.ncsa.uiuc.edu/. 


7. Develop and maintain open standards and robust tools. 


Addressed to: Funding agencies, public and private; scholars; librarians; 
curators; publishers; technologists 


Implementation: University consortia such as the Committee on 
Institutional Cooperation should license the SourceForge software and 
make it available to open-source developers in academic institutions. The 
National Endowment for the Humanities (NEH), National Archives and 
Records Administration (NARA), and Institute of Museum and Library 
Services (IMLS) should support the development, maintenance, and 
coordination of community-based standards such as the Text Encoding 
Initiative, Encoded Archival Description, Metadata Encoding and 
Transmission Standard, and Visual Resources Data Standards. The National 
Science Foundation (NSF), the Andrew W. Mellon Foundation, IMLS, and 
other funding agencies should support the development of tools for the 
analysis of digital content. 


Scholars in the humanities and social sciences should work with librarians, 
curators, publishers, and technologists to develop tools for producing, 
searching, analyzing, vetting, and representing knowledge, as well as 
standards for documenting data of all kinds. For hundreds of years, the most 
important tools of humanists and social scientists were pen or brush and 
paper. Today, scholars require a range of digital tools for research, teaching, 
and writing, including tools for finding, filtering and reviewing, processing 
and organizing, annotating, analyzing, and visualizing digital information. 
Even though we can point to current efforts in many of these areas, lack of 
coordination among them is a problem: a great deal of tool building is done 
on a local scale, and this results in unnecessary redundancy of effort. 
[footnote] 


title=Main Page. 


In part, this is because academic software developers may be prohibited by 
their university counsels from participating in open-source communities 
such as SourceForge (not because of any university opposition to 
open-source but, instead, because of statutory prohibitions against accepting the 
terms of use that these communities impose, especially regarding issues 
such as indemnification and governing law in the resolution of disputes). In 
that case, it is incumbent on the university community to provide and 
encourage the use of a parallel community infrastructure for open-source 
software development, in order to avoid duplication of effort and ensure 
that tool builders in academic settings are not specially disadvantaged 
compared with tool builders outside universities. Such an effort could begin 
with a consortium of major universities (for example, the Committee on 
Institutional Cooperation) licensing the SourceForge software and then 
making it available for use by academic open-source software developers 
on acceptable terms. 


Tools developed in one discipline may frequently be transferable or 
adaptable to other disciplines, but scholars may be unaware of tools 
developed outside their own discipline. Libraries, archives, and museums 
are positioned to serve as bridges among the sciences, humanities, social 
sciences, and arts in integrating widely disparate information and building 
new interdisciplinary relationships. The library of the University of 
California, Riverside, for example, is conducting research aimed at 
producing better machine-based, automatically generated metadata to 
improve the search and retrieval of multidisciplinary online content. 
[footnote] The Museums and Online Archives Collaboration Community 
Toolbox, developed by the California Digital Library, will enable museums 
and libraries to produce standards-based data for broad content sharing. 
[footnote] 

University of California, Riverside http://infomine.ucr.edu/. 

California Digital Library 

http://www.cdlib.org/inside/news/building collections.ppt. 


With respect to open standards, commercial entities that create significant 
digital collections (such as Google with its digitization of collections from 
five major U.S. research libraries) should produce at least one version of the 
resource in a nonproprietary format, if only for deposit with and local use 
by the institution that holds the originals being digitized—and universities 
should speak with a stronger voice on that point. Funding agencies— 
including the NSF, NEH, NARA, NDIIPP, and IMLS—and academic 
leaders should support the development and maintenance of digital tools 
and increase direct funding for the development and documentation of 
standards that improve the preservation and interoperability of digital 
content in the humanities and social sciences. Such support should include 
the development of opportunities for collaboration among tool builders and 
between tool builders and standards organizations, as well as scholarly 
validation of the tools and standards they use. The NEH, NARA, and IMLS 
should coordinate support for standards activity and should harmonize these 
efforts with the parallel tool- and resource-building activities of 
organizations such as the Digital Library Federation. 


New approaches are necessary to capture and integrate digital resources 
from different kinds of cultural heritage organizations, which have followed 
very different practices in describing and organizing their collections, and 
to maintain the intellectual context of collections when they are digitized. A 
research project at the University of Illinois, Urbana-Champaign, has 
created a collection-level registry and item-level repository, based on the 
Open Archives Initiative Protocol for Metadata Harvesting, that allows 
browsing of collection descriptions as well as content searching within and 
across collections. The project also serves as a testbed for research to 
improve the development of integrated, large-scale multidisciplinary digital 
libraries. [footnote] When best practices are identified, projects of this type 
can be scaled up to contribute to the “Global Digital Library.” 
Interoperability in software and in data is never perfect, but, in both cases, it 
has a better chance of emerging when information about those resources is 
open, easy to find, and readily reusable. Interoperability across the 
humanities and social science cyberinfrastructure therefore requires the 
continued development and promotion of vendor-independent, open 
standards for document modeling and data documentation as well as 
open-source methods for software development. 

University of Illinois, Urbana-Champaign, Digital Collections and Content 
http://imlsdcc.grainger.uiuc.edu/. 


Humanists and social scientists and their organizations must build the tools 
and standards they need: others will not do it for them. The summit on 
Digital Tools for the Humanities, supported by the NSF and held at the 
University of Virginia in September 2005, is a promising first step toward 
improving coordination in developing tools. The Andrew W. Mellon 
Foundation has also been funding the development of open-source tools. 
The Text Encoding Initiative Consortium is a long-standing and exemplary 
community-based standards organization focused on literary and linguistic 
texts, their uses, and their users. 


8. Create extensive and reusable digital collections.


Addressed to: The National Endowment for the Arts (NEA), the National
Endowment for the Humanities (NEH), the Institute of Museum and 
Library Services (IMLS), the National Archives and Records 
Administration (NARA), and other funding agencies, both public and 
private; scholars; research libraries and librarians; university presses; 
commercial publishers 


Implementation: National centers with a focus on particular methods or 
disciplines can organize a certain amount of scholar-driven digitization. 
Library organizations and libraries should sponsor discipline-based focus 
groups to discuss priorities with respect to digitization. When priorities are 
established, these should be relayed to the organizers of annual meetings on 
commercial and nonprofit partnerships, and they should be considered in
the distribution of grant funds by federal agencies and private foundations.
Funding to support the maintenance and coordination of standards will 
improve the reusability of digital collections. The NEA, NEH, and IMLS 
should work together to promote collaboration and skills development— 
through conferences, workshops, and/or grant programs—for the creation, 
management, preservation, and presentation of reusable digital collections, 
objects, and products. 


The extensive digitization of cultural heritage materials is one of the most 
exciting developments in the humanities and social sciences in the past 
century, and it should be continued and expanded through a thoughtful 
combination of institutional, public, and private support. The Commission 
believes that scholars have an important role to play in the development of 
commercial and nonprofit digital archives alike, and neither research 
libraries nor companies such as Google have yet gone far enough to 
encourage dialogue with the scholarly community on such questions as the 
selection of materials for digitization, decisions about what to omit from the 
digitized representation, or the design of descriptive metadata. 


We support efforts such as the Million Book Project, Project Gutenberg, the 
Open Content Alliance, and other noncommercial digitization projects. 
These might include efforts to digitize the archives of public broadcasting 
(the Public Broadcasting System [PBS] and others in the United States; the 
British Broadcasting Corporation [BBC] in the United Kingdom). More 
broadly, the Commission recognizes the importance of the cultural 
institutions whose collections are being digitized in these alliances and 
projects: scholarship and public understanding of the cultural record rely on 
museums, libraries, archives, and cultural institutions in general. The record 
that they preserve is the fundamental dataset for cultural research and 
education, and it is critical that they be engaged with scholars and educators 
in all disciplines, not only in creating interoperable and reusable digital 
content, but also to ensure that scholarly work in digital formats being 
produced today remains accessible in the future. The Walt Whitman 
Archive, spearheaded by the University of Nebraska, Lincoln, libraries, is 
creating a model Metadata Encoding and Transmission Standard (METS)
profile for digital thematic research collections, integrating high-quality 
data and metadata, in-depth description, high-resolution files, and encoded
texts. Created by scholars in collaboration with librarians and archivists,
this model project enables creators of digital thematic research collections
to make their work more sustainable and universally usable. [footnote] The
Institute of Museum and Library Services has supported the development of
A Framework of Guidance for Building Good Digital Collections, [footnote]
which establishes principles for the creation, preservation, and
management of digital collections and objects and is now maintained by the
National Information Standards Organization. Likewise, Cataloging
Cultural Objects, [footnote] a tool developed by the Visual Resources
Association with input from the library, archives, and museum 
communities, promotes good descriptive practices across disciplines. These 
kinds of tools should be continued and expanded. 

The Walt Whitman Archive http://www.whitmanarchive.org/.
http://www.niso.org/framework/Framework2.html. 
http://www.vraweb.org/ccoweb/index.html. 


The Commission endorses efforts such as the Digital Promise Project 
(www.digitalpromise.org), which aims to provide public support for the 
digitization of collections unlikely to attract commercial investment. 
Ambitious projects such as those undertaken by Google should not allow us 
to forget about the continued need for investment from the public and 
nonprofit sector. One recent and carefully reasoned estimate suggests that 
Google Book Search represents only about a third of the books held in 
research libraries—and there are many forms other than books in which the 
cultural record is purveyed, and many books not held by research libraries. 
[footnote] In public and nonprofit digitization efforts, priority must be
placed on those collections that commerce is unlikely to fund. They will 
probably be collections held by institutions that are content-rich and 
technology-poor, such as historically black colleges and universities, which 
are custodians of vast and important collections documenting the lives and 
heritage of African Americans. 

Brian Lavoie et al., “Anatomy of Aggregate Collections: The Example of 
Google Print for Libraries,” D-Lib Magazine 11:9 (September 2005) 
http://www.dlib.org/dlib/september05/lavoie/09lavoie.html.


The Commission also encourages continued investment in this area by the 
National Endowment for the Humanities, the Institute of Museum and
Library Services, the National Archives, the Andrew W. Mellon
Foundation, and other funding agencies, both public and private. In 
addition, we recommend that scholars and university presses cooperate with 
commercial digitization efforts with the goal of ensuring that they are as 
well designed and widely accessible as possible. Scholars should participate 
in institutional repository programs, and universities should develop 
programs at the national level to share digital content for teaching and 
research and to coordinate and share successful practices for working with 
digital resources. Institutional repositories should plan and be funded for the 
long-term preservation and migration of data. 


The general public, students, teachers, and scholars want to have online 
access to the full range of primary source materials housed in repositories 
such as museums, historical societies, local libraries and research libraries, 
special collections, archives, and privately held collections. This includes 
books and journals, newspapers and magazines, government documents, 
manuscripts, maps, photographs, satellite images, census data, recorded 
sound, film, broadcast television, and Web content. Information technology 
offers ways to reunite dispersed collections, as in the International 
Dunhuang Project, [footnote] which makes information and images of more
than a hundred thousand manuscripts, paintings, textiles, and other artifacts
from Dunhuang and other Silk Road sites freely available on the Internet; to
compare exemplars (for example, the Shakespeare quartos [footnote] or the
many variants of the Roman de la Rose [footnote]); to assemble the works
of single creators, such as the photographs of Mathew Brady; [footnote] or
to aggregate disparate examples pertaining to a single theme, such as the
University of Nebraska Press’s Gallery of the Open Frontier, with 23
thousand images of the American West. [footnote] We have only begun to
realize the potential of networked cultural heritage information. 

British Library, International Dunhuang Project (2006) http://idp.bl.uk/. 
British Library, Treasure in Full: Shakespeare in Quarto 


Johns Hopkins University and the Pierpont Morgan Library, Roman de la 
Rose http://rose.mse.jhu.edu/.

Library of Congress, Selected Civil War Photographs (2000) 
http://memory.loc.gov/ammem/cwphtml/cwphome.html. 
http://gallery.unl.edu/index.html. 


Conclusion 




We should place the world’s cultural heritage—its historical documentation, 
its literary and artistic achievements, its languages, beliefs, and practices— 
within the reach of every citizen. The value of building an infrastructure 
that gives all citizens access to the human record and the opportunity to 
participate in its creation and use is enormous, exceeding even the 
significant investment that will be required to build that infrastructure. The 
Commission is also keenly aware that in order for the future to have a 
record of the present, we need legal and viable strategies for digital 
preservation; considerable investment is now required on that front as well. 
Investments need to be made on the basis of research, and, in this case, a 
good deal more research is needed on digital preservation, tools, and uses 
and users of digital collections, in academic settings and beyond. [footnote]
Some research is already being done. At the University of California, 
Berkeley, e.g., a two-year “Digital Resource Study” is looking at the “use of 
digital resources in undergraduate education in the humanities and social 


But this is only part of the realization that the Commission hopes to leave 
with readers of this report. In a recent public presentation of the draft 
findings of this report, the Commission’s chair was asked, “If your report 
were a complete success, what would be the result, five or six years from 
now?” The answer is two-fold. First, if this report’s recommendations are 
implemented, then in five or six years, there will be a significantly 
expanded audience for humanities and social science research among the 
general public. A relatively small audience on the open Web will still be a 
far larger audience than scholars in these disciplines have been able to find 
up to now in academic bookstores, research libraries, and print journals. 
Second, if the recommendations of this report are implemented, humanities 
and social science researchers five or six years from now will be answering 
questions that today they might not even consider asking. 


The Commission understands that increasing access to scholarly research 
and experimenting with new research methods both entail some risk, but it
firmly and collectively believes that the risk of not doing both is far greater,
in terms of the ultimate sustainability of the disciplines in question. Senior 
scholars in the humanities and social sciences and senior administrators in 
research universities must lead the way to a new, more open, and more 
productive relationship with the public, and to new ways of doing 
scholarship. 


Appendix I: The Charge to the Commission 




As scholars in the humanities and social sciences use digital tools and 
technologies with increasing sophistication and innovation, they are 
transforming their practices of collaboration and communication. New 
forms of scholarship, criticism, and creativity proliferate in arts and letters 
and in the social sciences, resulting in significant new works accessible and 
meaningful only in digital form. Many technology-driven projects in these 
areas have become enormously complex and, at the same time, 
indispensable for teaching and research. 


For their part, scientists and engineers no longer see digital technologies 
merely as tools enhancing established research methodologies but as forces 
creating environments that enable the creation of new knowledge. The 
recent National Science Foundation report “Revolutionizing Science and 
Engineering through Cyberinfrastructure” argues for large-scale 
investments across all disciplines to develop a shared technology 
infrastructure that will support ever-greater capacities. Those capacities 
would include the development and deployment of new tools; the rapid 
adoption of best practices; interoperability; the ability to invoke services 
over the network; secure sharing of facilities; long-term storage of, and 
access to, important data; and ready availability of expertise and assistance. 


The needs of humanists and scientists converge in this emerging 
cyberinfrastructure. As the importance of technology-enabled innovation 
grows across all fields, scholars are increasingly dependent on sophisticated 
systems for the creation, curation, and preservation of information. They are 
also dependent on a policy, economic, and legal environment that 
encourages appropriate and unimpeded access to both digital information 
and digital tools. It is crucial for the humanities and the social sciences to 
join scientists and engineers in defining and building this infrastructure so 
that it meets the needs and incorporates the contributions of humanists and 
social scientists. 


ACLS is sponsoring a national commission to investigate and report on 
these issues. The Commission will operate throughout 2004 and is charged 
to 


• describe and analyze the current state of humanities and social science
  cyberinfrastructure;

• articulate the requirements and potential contributions of the
  humanities and the social sciences in developing a cyberinfrastructure
  for information, teaching, and research;

• recommend areas of emphasis and coordination for the various
  agencies and institutions, public and private, that contribute to the
  development of this cyberinfrastructure.


Among the questions to be explored in pursuing these three goals are: 


Describe and analyze the current state of humanities and social science 
cyberinfrastructure. 


1. What can be generalized from the already significant digital projects in 
the humanities and social sciences? Which humanities and social 
science communities are most active, and why? Of those that are not, 
which might soon, easily and/or profitably, engage more deeply with 
digital technology? How have scholars developed computing 
applications to accomplish their scholarly and expressive goals? 
Where have they failed to do so, and what can be learned from those 
failures? 

2. What new intellectual strategies, critical methods, and creative 
practices are emerging in response to technical applications in the 
humanities? To what extent are disciplines in the humanities 
transforming themselves through the use of computing and networking 
technologies? What are the implications of those transformations? 

3. What organizations and structures have empowered or impeded the 
digital humanities? What are examples of successful and durable 
collaboration between technologists and humanities scholars? Where 
and how are people being trained to support and engage in such 
collaborations? What has been the role of libraries, archives, and 
publishers in these projects? 


Articulate the requirements and the potential contributions of the 
humanities and the social sciences in developing a national 
cyberinfrastructure for information, teaching, and research. 


1. What are the "grand challenge" problems for the humanities and social 
sciences in the coming decade? Are they tractable to computation? Do 
they require cyberinfrastructure in some other way? 

2. What technological developments can we predict that will have special 
impact in the humanities and social sciences in the near future? 

3. Which are the most important functionalities necessary for new 
research and development in cyberinfrastructure generally? What 
kinds of humanities or social science problems are theoretically 
difficult or expressively complex, or challenge our ability to formulate 
a computable problem in some other way? What kinds of humanities 
or social science problems are computationally intensive, require 
especially high bandwidth, or present resource challenges in other 
ways? 

4. What are the barriers that confront humanities and social science users 
who wish to take advantage of state-of-the-art computational, storage, 
networking, and visualization resources in their research? What can be 
done to remove these barriers? 

5. What impact will the availability of high-performance infrastructure 
have on enabling cross-disciplinary research? What will high- 
performance infrastructure mean for the broader social impact of 
humanities and social sciences? 

6. What can be done to improve education and outreach activities in the 
computer-science and engineering community to broaden access to 
high-end computing? How can computing expertise in the humanities 
and social sciences themselves be increased? 


Recommend areas of emphasis and coordination for the various agencies 
and institutions, public and private, that contribute to the development of 
humanities cyberinfrastructure. 


1. What investments in cyberinfrastructure are likely to have the greatest 
impact on scholarship in the humanities and social sciences? 


2. What research infrastructure should be coupled with 
cyberinfrastructure? 

3. How can private and public funding agencies coordinate their efforts 
and cooperate with universities, research libraries, disciplinary 
organizations, and others to maximize the benefits of 
cyberinfrastructure for the humanities and social sciences? 

4. How should new investments in infrastructure and technologies be 
administered so as to include the humanities? 


Appendix II: Public Information-Gathering Sessions




The ACLS Commission on Cyberinfrastructure for the Humanities and 
Social Sciences convened seven public information-gathering sessions to 
hear from those interested in contributing to the work of the Commission. 
Below is a record of those who testified at these public sessions, held 
throughout the country on the following dates. Transcripts of these 
testimonies are available on the ACLS Web site at: 


Tuesday, April 27th, 2004 — Washington, DC 


• Michael Jensen, National Academies Press
• Joyce Ray, Institute of Museum and Library Services
• Max Evans, National Historical Publications and Records Commission


Saturday, May 22nd, 2004 — Chicago 


• William Barnett, Field Museum
• James Grossman, Newberry Library
• Myron P. Gutmann, University of Michigan, Ann Arbor
• James Hilton, University of Michigan, Ann Arbor
• Lorna Hughes, New York University
• Martin Mueller, Northwestern University
• Bill Regier, University of Illinois Press


Saturday, June 19th, 2004 — New York 


• Stephen Brier, New Media Lab, CUNY Graduate Center
• Diana Taylor, New York University
• Kevin Guthrie, Ithaka Harbors
• Kate Wittenberg, Columbia University
• Robert Darnton, Princeton University
• Stanley N. Katz, Princeton University


Saturday, August 21st, 2004 — Berkeley 


• Suzanne Calpestri, University of California, Berkeley
• Henry Brady, University of California, Berkeley
• Michael Buckland, Electronic Cultural Atlas Initiative (ECAI)
• Richard Rinehart, University of California, Berkeley
• Geoffrey Nunberg, Stanford University
• Gregory Niemeyer, University of California, Berkeley
• John Ober, University of California, Berkeley
• Marc Levoy, Stanford University


Saturday, September 18th, 2004 — Los Angeles 


• Janice Reiff, University of California, Los Angeles
• Kenneth Hamma, J. Paul Getty Trust
• Jerry D. Campbell, University of Southern California
• Douglas Greenberg, Survivors of the Shoah Visual History Foundation
• David Theo Goldberg, University of California Humanities Research
  Institute
• Zoe Borofsky, University of California, Los Angeles


Tuesday, October 26th, 2004 — Baltimore 


• James J. O’Donnell, Georgetown University
• David Greenbaum, The Interactive University Project, University of
  California, Berkeley
• Fred Heath, University of Texas, Austin
• Patricia Kosco Cossard, Medieval Academy of America, University of
  Maryland
• Bernard Frischer, Institute for Advanced Technology in the
  Humanities, University of Virginia


